With my code, I loop over files and count patterns in them. My code is as follows:
from collections import defaultdict
import csv, os, re
from itertools import groupby
import glob

def count_kmers(read, k):
    counts = defaultdict(list)
    num_kmers = len(read) - k + 1
    for i in range(num_kmers):
        kmer = read[i:i+k]
        if kmer not in counts:
            counts[kmer] = 0
        counts[kmer] += 1
    for item in counts:
        return(basename, sequence, item, counts[item])

for fasta_file in glob.glob('*.fasta'):
    basename = os.path.splitext(os.path.basename(fasta_file))[0]
    with open(fasta_file) as f_fasta:
        for k, g in groupby(f_fasta, lambda x: x.startswith('>')):
            if k:
                sequence = next(g).strip('>\n')
            else:
                d1 = list(''.join(line.strip() for line in g))
                d2 = ''.join(d1)
                complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
                reverse_complement = "".join(complement.get(base, base) for base in reversed(d1))
                d3 = list(''.join(line.strip() for line in reverse_complement))
                d4 = ''.join(d3)
                d5 = (d2+d4)
                counting = count_kmers(d5, 5)
                with open('kmer.out', 'a') as text_file:
                    text_file.write(counting)
And my output looks like this:
1035 1 GAGGA 2
1035 1 CGCAT 1
1035 1 TCCCG 1
1035 1 CTCAT 2
1035 1 CCTGG 2
1035 1 GTCCA 1
1035 1 CATGG 1
1035 1 TAGCC 2
1035 1 GCTGC 7
1035 1 TGCAT 1
The code works fine, but I cannot write my output to a file. I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-190-89e3487da562> in <module>()
37 counting = count_kmers(d5, 5)
38 with open('kmer.out', 'w') as text_file:
---> 39 text_file.write(counting)
TypeError: write() argument must be str, not tuple
What am I doing wrong, and how can I make sure that my code writes the output to a text file?
The original version of count_kmers() did not contain a return statement, which means it implicitly returned None. Because you assigned that result to counting, the errors you saw followed directly from it.
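As a minimal illustration of the implicit return, here is a sketch with a made-up function name (count_without_return is hypothetical, not part of the question's code). It uses an in-memory io.StringIO as a stand-in for the output file; a function with no return statement hands back None, and passing None to write() raises a TypeError:

```python
import io

def count_without_return(read, k):
    """Hypothetical stand-in: builds the counts but never returns them."""
    counts = {}
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    # No return statement here, so the caller receives None.

result = count_without_return("GATTACA", 3)
print(result)  # None

buffer = io.StringIO()  # in-memory stand-in for a text file
try:
    buffer.write(result)  # write() needs a str, not None
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```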
After your edit, the end of the function looked like this:
    for item in counts:
        return(basename, sequence, item, counts[item])
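which returns a tuple with four values. Since write() accepts only a str, passing that tuple raises the TypeError shown in the question. The return also exits the function on the first pass through the loop, so at most one k-mer would ever be reported.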
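One possible way to address both problems is sketched below, under a few assumptions: count_kmers() returns all the counts instead of returning inside the loop, and a separate helper (format_counts, a hypothetical name) turns them into strings before writing. The sample sequence and the "1035" / "1" labels are made-up stand-ins for one FASTA record, and io.StringIO stands in for the real output file.

```python
import io
from collections import Counter

def count_kmers(read, k):
    """Return a Counter mapping each k-mer in `read` to its count."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def format_counts(basename, sequence, counts):
    """Hypothetical helper: one space-separated text line per k-mer."""
    return [f"{basename} {sequence} {kmer} {n}\n" for kmer, n in counts.items()]

# Made-up sample values standing in for one FASTA record.
counts = count_kmers("GATTACATTA", 3)
lines = format_counts("1035", "1", counts)

out = io.StringIO()  # for a real file: with open('kmer.out', 'a') as out:
out.writelines(lines)  # every element is a str, so no TypeError
print(out.getvalue(), end="")
```

Passing basename and sequence in as arguments, rather than reading them as globals inside count_kmers(), keeps the counting function usable on its own and makes the formatting step explicit.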