I have downloaded a list of coordinates of yeast genes from Xu et al., 2009 (see Table S3). Unfortunately, the file is not in a standard format, so it does not appear to be compatible with the programs I would like to use, i.e. HOMER, bedops, or bedtools. Could anyone help me convert it to GFF format using Unix tools or R (other languages are also welcome if the code can simply be copied and pasted)? I tried to recreate the layout shown on the Ensembl website, but these programs still did not recognize the result as GFF. Here is the beginning of the file (there are actually ~7K lines):
ID chr strand start end type name commonName endConfidence source
ST0001 1 + 9369 9601 SUTs SUT001 SUT001 bothEndsMapped Manual
ST0002 1 + 30073 30905 CUTs CUT001 CUT001 bothEndsMapped Automatic
ST0003 1 + 31153 32985 ORF-T YAL062W GDH3 bothEndsMapped Manual
ST0004 1 + 33361 34897 ORF-T YAL061W BDH2 bothEndsMapped Manual
ST0005 1 + 35097 36393 ORF-T YAL060W BDH1 bothEndsMapped Manual
ST0006 1 + 36545 37329 ORF-T YAL059W ECM1 bothEndsMapped Manual
ST0007 1 + 37409 39033 ORF-T YAL058W CNE1 bothEndsMapped Manual
ST0008 1 ...
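For reference, here is a minimal sketch of one possible conversion, written in Python rather than Unix tools or R. It assumes the ten columns shown above are whitespace-separated, that no field contains spaces, and that the coordinates are 1-based and inclusive; none of this is confirmed by the table itself. The function name `table_to_gff3` and the `chrom_prefix` parameter are hypothetical and only for illustration. Putting the table's `source` column into GFF column 2 and its `type` column into GFF column 3 is also an assumption. GFF requires tab-separated fields, so a file rebuilt by hand with spaces may be rejected for that reason alone.

```python
def table_to_gff3(lines, chrom_prefix=""):
    """Convert rows of the 10-column table into GFF3 lines.

    chrom_prefix is a hypothetical convenience for matching the
    chromosome names of the genome in use (e.g. "chr" turns "1" into "chr1").
    """
    out = ["##gff-version 3"]
    for line in lines:
        fields = line.split()
        # Skip the header, blank lines, and truncated rows.
        if len(fields) != 10 or fields[0] == "ID":
            continue
        id_, chrom, strand, start, end, type_, name, common, conf, source = fields
        # GFF requires start <= end regardless of strand.
        start, end = sorted((int(start), int(end)))
        # Column 9: ID, Name and Alias are standard GFF3 attribute tags;
        # endConfidence is a custom tag carried over from the table.
        attrs = f"ID={id_};Name={name};Alias={common};endConfidence={conf}"
        out.append("\t".join([
            chrom_prefix + chrom,  # 1: seqid
            source,                # 2: source (assumed mapping)
            type_,                 # 3: type (assumed mapping)
            str(start),            # 4: start
            str(end),              # 5: end
            ".",                   # 6: score (none available)
            strand,                # 7: strand
            ".",                   # 8: phase (not applicable)
            attrs,                 # 9: attributes
        ]))
    return out


if __name__ == "__main__":
    sample = [
        "ID chr strand start end type name commonName endConfidence source",
        "ST0001 1 + 9369 9601 SUTs SUT001 SUT001 bothEndsMapped Manual",
    ]
    print("\n".join(table_to_gff3(sample)))
```

To convert the whole file, the same function can be given an open file handle instead of a list, and the returned lines written out joined with newlines. Whether the chromosome names need a prefix (or Roman numerals) depends on the genome files used with HOMER, bedops, or bedtools, so that part should be checked against them.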