
Problem with GFF parser by BCBio

I am trying to parse a GFF file using the BCBio GFF parser, and I get the following error. Can anybody help me resolve this problem?

Traceback (most recent call last):
  File "gff_parse.py", line 6, in <module>
    for rec in GFF.parse(in_handle):
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 709, in parse
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 299, in parse_in_parts
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 320, in parse_simple
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 603, in _gff_process
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 634, in _lines_to_out_info
  File "build/bdist.linux-x86_64/egg/BCBio/GFF/GFFParser.py", line 183, in _gff_line_map
ValueError: invalid literal for int() with base 10: 'New Start'

Here is my code:

from BCBio import GFF

in_file = "infile.gff"

# The with block closes the file even if parsing raises an exception.
with open(in_file) as in_handle:
    for rec in GFF.parse(in_handle):
        print(rec)

Thanks,
Tulika

How did you generate the GFF file? It appears to contain at least one invalid line. The fourth column should contain an integer giving the start coordinate of a feature; the error message indicates that it contains the value 'New Start' instead.

The GFF3 specification page has some examples of valid GFF, and the online validator can help with debugging formatting issues like this.
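
To locate the offending line yourself, a small pre-check along these lines may help. It is only a sketch that does not use BCBio: the function name find_bad_coordinates is made up for illustration, and the header row in the sample data is a guess at what a line containing 'New Start' might look like, not your actual file. It reports every feature line whose start (column 4) or end (column 5) is not an integer.

```python
def find_bad_coordinates(lines):
    """Return (line_number, column_name, value) for each suspicious GFF line.

    `lines` is any iterable of strings, for example an open file handle.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) < 5:
            # Not enough tab-separated columns to hold start and end.
            problems.append((number, "columns", line))
            continue
        for name, value in (("start", fields[3]), ("end", fields[4])):
            try:
                int(value)
            except ValueError:
                problems.append((number, name, value))
    return problems


# Hypothetical sample: a stray header row followed by one valid feature line.
sample = [
    "##gff-version 3",
    "Seq ID\tSource\tType\tNew Start\tNew End\tScore\tStrand\tPhase\tAttributes",
    "chr1\tprog\tgene\t100\t900\t.\t+\t.\tID=gene1",
]

problems = find_bad_coordinates(sample)
for number, name, value in problems:
    print("line %d: bad %s value %r" % (number, name, value))
```

To check a real file, pass an open handle instead of the sample list, for example with open("infile.gff") as handle: find_bad_coordinates(handle). Any line it reports would need to be removed or corrected before GFF.parse can read the file.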
