Modifying each line in a text file in Python
I have a large file, like the following example:
1 10161 10166 3
1 10166 10172 2
1 10172 10182 1
1 10183 10192 1
1 10193 10199 1
1 10212 10248 1
1 10260 10296 1
1 11169 11205 1
1 11336 11372 1
2 11564 11586 2
2 11586 11587 3
2 11587 11600 4
3 11600 11622 2
I want to add "chr" at the beginning of each line, for example:
chr1 10161 10166 3
chr1 10166 10172 2
chr1 10172 10182 1
chr1 10183 10192 1
chr1 10193 10199 1
chr1 10212 10248 1
chr1 10260 10296 1
chr1 11169 11205 1
chr1 11336 11372 1
chr2 11564 11586 2
chr2 11586 11587 3
chr2 11587 11600 4
chr3 11600 11622 2
I tried the following code in Python:
file = open("myfile.bg", "r")
for line in file:
    newline = "chr" + line
out = open("outfile.bg", "w")
for new in newline:
    out.write("n"+new)
But it does not produce what I want. Do you know how to fix the code for this purpose?
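The problem with your code is that you iterate over the input file without doing anything with the data you read:
file = open("myfile.bg", "r")
for line in file:
    newline = "chr" + line
The last line assigns each line of myfile.bg, prefixed with 'chr', to the newline variable (a string), and each line overwrites the previous result.
Then you iterate over the string in newline (which will be the last line of the input file, with the 'chr' prefix):
out = open("outfile.bg", "w")
for new in newline:  # <== this iterates over a string, so `new` will be individual characters
    out.write("n"+new)  # this only writes 'n' before each character in newline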
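Both effects can be reproduced without touching the disk. The following is a minimal sketch that uses `io.StringIO` as an in-memory stand-in for myfile.bg (an assumption for illustration; the rows are taken from the question). The names `sample`, `last_line`, and `chars` are hypothetical.

```python
import io

# In-memory stand-in for myfile.bg (illustrative assumption, not the real file).
sample = io.StringIO("1 10161 10166 3\n2 11564 11586 2\n3 11600 11622 2\n")

for line in sample:
    newline = "chr" + line  # rebound on every pass, so earlier lines are lost

# After the loop, only the last prefixed line is left.
last_line = newline

# Iterating over a string yields single characters, not lines.
chars = [new for new in newline]
```

Here `last_line` holds only the third row with its prefix, and `chars` is a list of one-character strings, which is why the original loop wrote an "n" in front of every character.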
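If this is a one-off in the shell, you can use a one-liner:
open('outfile.bg', 'w').writelines(['chr' + line for line in open('myfile.bg').readlines()])
More correct (especially in a program, where you care about open file handles and the like) would be:
with open('myfile.bg') as infp:
    lines = infp.readlines()
with open('outfile.bg', 'w') as outfp:
    outfp.writelines(['chr' + line for line in lines])
If the file is really large (approaching the size of available memory), you need to process it incrementally:
with open('myfile.bg') as infp:
    with open('outfile.bg', 'w') as outfp:
        for line in infp:
            outfp.write('chr' + line)
(This can be slower than the first two versions, though.)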
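The incremental version can be wrapped in a small reusable function. This is only a sketch: the function name `add_prefix` and its `prefix` parameter are hypothetical, and the demonstration writes to a temporary directory rather than to the real myfile.bg.

```python
import os
import tempfile


def add_prefix(src_path, dst_path, prefix="chr"):
    """Copy src_path to dst_path, prepending prefix to every line.

    Only one line is held in memory at a time.
    """
    with open(src_path) as infp, open(dst_path, "w") as outfp:
        for line in infp:
            outfp.write(prefix + line)


# Demonstration on a throwaway file (paths are illustrative assumptions).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "myfile.bg")
dst = os.path.join(tmpdir, "outfile.bg")

with open(src, "w") as f:
    f.write("1 10161 10166 3\n2 11564 11586 2\n")

add_prefix(src, dst)

with open(dst) as f:
    result = f.read()
```

Opening both files in a single `with` statement is equivalent to the nested form and closes both handles when the block ends.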
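Completely agree with @rychaza; here is my version using your code:
file = open("myfile.bg", "r")
out = open("outfile.bg", "w")
for line in file:
    out.write("chr" + line)
out.close()
file.close()
The problem is that you iterate over the input and reassign the same variable (newline) for every line, then open the output file and iterate over newline, which is a string, so new will be each character of that string.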
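I think what you need is something like this:
with open('myfile.bg', 'rb') as file:
    with open('outfile.bg', 'wb') as out:
        for line in file:
            # In binary mode each line is a bytes object, so under Python 3 the prefix must be bytes as well
            out.write(b'chr' + line)
When you iterate over the file, line should already contain the trailing newline.
The with statement automatically cleans up the file handles when the block ends.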
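Both points are easy to check. The sketch below writes one row from the question to a temporary file (the path and the variable names are assumptions for illustration), then inspects the file object after the `with` block and the lines read back.

```python
import os
import tempfile

# Throwaway file standing in for myfile.bg (illustrative assumption).
path = os.path.join(tempfile.mkdtemp(), "myfile.bg")

with open(path, "w") as out:
    out.write("1 10161 10166 3\n")

# The handle is closed as soon as the with block exits.
closed_after_block = out.closed

with open(path) as infp:
    lines = list(infp)  # each element keeps its trailing newline

prefixed = ["chr" + line for line in lines]
```

Because each line keeps its newline, writing `'chr' + line` needs no extra line terminator. If the last line of an input file has no trailing newline, it is written back without one as well.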