Displaying the length of individual sequences in a file

Question

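I have a file containing two sequences. I have a program that reads all the sequences, joins them together, and displays the combined length of both. Now I want to display each length separately. The two sequences are separated by the > symbol.

Example: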

SEQ1 >ATGGGACTAGCAGT

SEQ2  >AGGATGATGAGTGA

Program:

#!/usr/bin/python
import re
fh=open('clostp1.fa','r')
count=0
content=fh.readlines()
fh.close()
seq=''
patt=re.compile('>(.*?)')
for item in content:
    m=patt.match(item)
    if not m:
        s=item.replace('\n','')
        seq=seq+s
seq=seq.replace('\s','')       
print seq
print 'The length of the coding sequence of the bacillus' 
print len(seq)

Answer 1

for line in open("clostp1.fa"):
    name, sequence = map(str.strip,line.split('>'))
    print "The length of %s is %s"%(name, len(sequence))

Answer 2

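If I understand correctly, you want to print each sequence along with its length, right? I think you just need a function that returns the sequences, and then you can do whatever you want with them afterwards.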

#!/usr/bin/python

def get_content(file):
    """
    Returns a dict with the name of the seq and its value
    """
    result = {}
    for line in open(file):
        # Split "SEQ1 >ATGG..." at '>' and trim whitespace from both parts
        name, value = line.strip().split(">")
        result[name.strip()] = value.strip()
    return result

You get a dict back, and then you print whatever you need from it.
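For instance, printing the lengths from that dict might look like the sketch below. It uses an in-memory dict in place of the file so that it runs on its own; report_lengths is a hypothetical helper name, not part of the answer's code.

```python
def report_lengths(sequences):
    """Return one 'The length of NAME is N' line per sequence in the dict."""
    lines = []
    # Sort by name so the output order does not depend on dict ordering.
    for name, value in sorted(sequences.items()):
        lines.append("The length of {0} is {1}".format(name, len(value)))
    return lines


# Stand-in for what get_content('clostp1.fa') would return for the
# two sample lines shown in the question.
sample = {"SEQ1": "ATGGGACTAGCAGT", "SEQ2": "AGGATGATGAGTGA"}
for text in report_lengths(sample):
    print(text)
```

Keeping the parsing and the printing in separate functions means the same dict can be reused for other reports later.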

Answer 3

for line in open("clostp1.fa"):
    name, _, seq = line.partition('>')
    name, seq = name.rstrip(), seq.rstrip()
    print("The length of {} is {}".format(name, len(seq)))

partition is a better fit here than split. You need to rstrip each part separately. The {} format syntax works in py3.1 and later; use

print("The length of {0} is {1}".format(name, len(seq)))

to make it work in py2.6.
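To illustrate why partition is the safer choice, here is a small sketch. The helper name parse_line and the blank-line input are made up for the illustration. str.partition always returns three parts, so the unpacking never fails, whereas unpacking the result of split('>') into two names raises ValueError when a line contains no > at all.

```python
def parse_line(line):
    """Split 'NAME >SEQUENCE' at the first '>' and trim both parts."""
    name, sep, seq = line.partition('>')
    return name.strip(), seq.strip()


print(parse_line("SEQ1 >ATGGGACTAGCAGT\n"))
print(parse_line("\n"))  # a blank line simply gives two empty strings

# split('>') on the same blank line yields a single element,
# so unpacking it into two names fails.
try:
    name, seq = "\n".split('>')
except ValueError:
    print("split could not be unpacked into two parts")
```

Checking whether the returned name is empty is then enough to skip blank lines.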

Answer 4

import re
# Named groups capture the name before '>' and the sequence after it
pattern = re.compile(r'(?P<seqname>\w*)\s*>\s*(?P<seqval>\w*)')
for item in open('clostp1.fa', 'r'):
    m = pattern.match(item)
    if m:
        print "sequence name: %s - %s length" % (m.groupdict()['seqname'], len(m.groupdict()['seqval']))
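As a rough check of what the named groups capture, the sketch below runs the same pattern on the sample lines from the question instead of reading the file. name_and_length is a hypothetical helper, and m.group('seqname') is just a shorter spelling of m.groupdict()['seqname'].

```python
import re

# Same pattern as in the answer above, written as a raw string.
pattern = re.compile(r'(?P<seqname>\w*)\s*>\s*(?P<seqval>\w*)')


def name_and_length(line):
    """Return (name, sequence length) for one line, or None if it has no '>'."""
    m = pattern.match(line)
    if m is None:
        return None
    return m.group('seqname'), len(m.group('seqval'))


# Sample lines taken from the question, plus a blank line.
for line in ["SEQ1 >ATGGGACTAGCAGT\n", "SEQ2  >AGGATGATGAGTGA\n", "\n"]:
    print(name_and_length(line))
```

Because \s* sits on both sides of the >, the pattern tolerates the varying number of spaces seen in the sample lines.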

4 answers

Answer 1
4 2009-10-15 07:55:41

Answer 2
1 2009-10-15 08:05:28

Answer 3
0 2009-10-15 08:14:41

Answer 4
0 2009-10-15 08:17:51
