Bcbio-gff file creation issue

When I create a file using GFF.write(), I get an extra line with "annotation" and "remark" in the source and type columns, followed by a percent-encoded list of sequence regions:

##gff-version 3
##sequence-region NC_011594.1 1 16779
NC_011594.1 annotation  remark  1   16779   .   .   .   gff-version=3;sequence-region=%28%27NC_011594.1%27%2C 0%2C 16971%29,%28%27NC_042493.1%27%2C 0%2C 132544852%29, (continues on and on)
NC_011594.1 RefSeq  gene    1   1531    .   +   .   Dbxref=GeneID:7055888;ID=gene-COX1;Name=COX1;gbkey=Gene;gene=COX1;gene_biotype=protein_coding

Any idea why it is there, what it is for, and how I can avoid it? I fear it might become a problem when the file is used in third-party software.

I imported only the bcbio-gff package, but I believe it is part of Biopython; link: https://biopython.org/wiki/GFF_Parsing

To your first question - "Why is it there?"

  • I can only presume that, by default, the package author wanted to export as much information as possible.
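To see what that extra line actually carries, the attribute value can be percent-decoded. This is a minimal sketch using only the standard library; the encoded fragment is copied from the output in the question, and the variable names are illustrative. The decoded text reads like the string form of a Python tuple, which suggests (but does not prove) that the line is a dump of the record's annotations.

```python
from urllib.parse import unquote

# Attribute fragment copied from the "annotation remark" line in the question.
encoded = "%28%27NC_011594.1%27%2C 0%2C 16971%29"

# Percent-decoding turns %28, %27, %2C and %29 back into ( ' , and ).
decoded = unquote(encoded)
print(decoded)  # ('NC_011594.1', 0, 16971)
```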

To your next question - "How can I avoid it?"

  • Unfortunately, there is no off switch. For me, the solution was to remove all annotations from the exported sequences, i.e. set the annotations attribute to an empty dictionary before calling GFF.write().

Example:

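from Bio import SeqIO
from BCBio import GFF

# Read a single GenBank record
g = SeqIO.read('NC_003888.3.gb', 'gb')

# Clear the record-level annotations so that no "annotation remark" line is written
g.annotations = {}

with open('t2.gff', 'w') as f:
    GFF.write([g], f)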

Head of the output file, with no annotation remark line:

head t2.gff 
##gff-version 3
##sequence-region NC_003888.3 1 8667507
NC_003888.3 feature source  1   8667507 ... removed for clarity ....
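If the annotations have to stay on the record, one possible alternative is to filter the written file afterwards. The sketch below is not part of bcbio-gff: drop_annotation_remarks is a hypothetical helper, and it assumes the columns are tab-separated, as in standard GFF3, with "annotation" in the source column and "remark" in the type column.

```python
def drop_annotation_remarks(lines):
    """Yield GFF lines, skipping records with source 'annotation' and type 'remark'."""
    for line in lines:
        # Directive and comment lines (starting with '#') are always kept.
        if not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) >= 3 and cols[1] == "annotation" and cols[2] == "remark":
                continue
        yield line


# Small in-memory sample modelled on the output in the question (attributes shortened).
sample = [
    "##gff-version 3\n",
    "NC_011594.1\tannotation\tremark\t1\t16779\t.\t.\t.\tgff-version=3\n",
    "NC_011594.1\tRefSeq\tgene\t1\t1531\t.\t+\t.\tID=gene-COX1\n",
]

cleaned = list(drop_annotation_remarks(sample))
print("".join(cleaned), end="")
```

The same generator can be fed an open file handle instead of a list, writing the yielded lines to a second file.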
