Modifying each line of a text file in Python

Question

I have a big file like the example below:

1   10161   10166   3
1   10166   10172   2
1   10172   10182   1
1   10183   10192   1
1   10193   10199   1
1   10212   10248   1
1   10260   10296   1
1   11169   11205   1
1   11336   11372   1
2   11564   11586   2
2   11586   11587   3
2   11587   11600   4
3   11600   11622   2

I would like to add "chr" at the beginning of each line, for example:

chr1    10161   10166   3
chr1    10166   10172   2
chr1    10172   10182   1
chr1    10183   10192   1
chr1    10193   10199   1
chr1    10212   10248   1
chr1    10260   10296   1
chr1    11169   11205   1
chr1    11336   11372   1
chr2    11564   11586   2
chr2    11586   11587   3
chr2    11587   11600   4
chr3    11600   11622   2

I tried the following code in Python:

   file = open("myfile.bg", "r")
   for line in file: 
      newline = "chr" + line
   out = open("outfile.bg", "w")
   for new in newline:
      out.write("n"+new)

but it did not produce what I wanted. Do you know how to fix the code for this purpose?

Answer 1

The problem with your code is that you iterate over the input file without doing anything with the data you read:

file = open("myfile.bg", "r")
for line in file: 
    newline = "chr" + line

The last line assigns each line of myfile.bg, with 'chr' prepended, to the variable newline (a string), so each iteration overwrites the previous result.

Then you iterate over the string in newline (which will be the last line of the input file, with 'chr' prepended):

out = open("outfile.bg", "w")
for new in newline:       # <== this iterates over a string, so `new` will be individual characters
    out.write("n"+new)    # this only writes 'n' before each character in newline
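Both effects can be reproduced without touching any files. The following is a minimal sketch; the list `lines` is a hypothetical in-memory stand-in for the contents of myfile.bg, and the names `last_kept` and `chars` are introduced only for illustration.

```python
# Hypothetical in-memory stand-in for the lines of myfile.bg.
lines = ["1\t10161\t10166\t3\n", "2\t11564\t11586\t2\n"]

newline = ""
for line in lines:
    newline = "chr" + line  # rebinds the same name on every pass

# Only the last line survives the loop.
last_kept = newline

# Iterating over a string yields one-character strings.
chars = list(newline)
```

After the loop, `last_kept` holds only the final line, and `chars` is a list of single characters, which is why the original output had an 'n' in front of every character.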

If you're just doing this once, e.g. in the shell, you could use the one-liner:

open('outfile.bg', 'w').writelines(['chr' + line for line in open('myfile.bg').readlines()])

More correct (especially in a program, where you would care about open file handles etc.) would be:

with open('myfile.bg') as infp:
    lines = infp.readlines()
with open('outfile.bg', 'w') as outfp:
    outfp.writelines(['chr' + line for line in lines])

If the file is really big (close to the size of your available memory), you'll need to process it incrementally:

with open('myfile.bg') as infp:
    with open('outfile.bg', 'w') as outfp:
        for line in infp:
            outfp.write('chr' + line)

(This can be slower than the first two versions, although file I/O is buffered, so the difference is often small.)
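The incremental approach can be wrapped in a small reusable function. This is only a sketch: the function name `add_prefix` and its parameters are made up for illustration, and the temporary files exist just to make the example self-contained.

```python
import os
import tempfile


def add_prefix(src_path, dst_path, prefix="chr"):
    """Copy src_path to dst_path, prepending prefix to every line."""
    with open(src_path) as infp, open(dst_path, "w") as outfp:
        for line in infp:  # only one line is held in memory at a time
            outfp.write(prefix + line)


# Small self-check on throwaway files (hypothetical sample data).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "myfile.bg")
dst = os.path.join(tmpdir, "outfile.bg")
with open(src, "w") as fp:
    fp.write("1\t10161\t10166\t3\n2\t11564\t11586\t2\n")

add_prefix(src, dst)
with open(dst) as fp:
    result = fp.read()
```

Opening both files in a single with statement avoids the extra nesting level while still closing both handles when the block ends.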

Answer 2

Totally agree with @rychaza; here's my version using your code:

file = open("myfile.bg", "r")
out = open("outfile.bg", "w")
for line in file:
    out.write("chr" + line)
out.close()
file.close()

Answer 3

The issue is that you iterate over the input and reassign the same variable (newline) for every line, then open a file for output and iterate over newline, which is a string, so new will be each character in that string.

I think something like this is what you're looking for:

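with open('myfile.bg', 'rb') as file:
  with open('outfile.bg', 'wb') as out:
    for line in file:
      # In binary mode each line is bytes, so the prefix must be bytes too
      # (on Python 3, 'chr' + line would raise a TypeError).
      out.write(b'chr' + line)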

When iterating over a file, line already contains the trailing newline.

The with statements automatically close the file handles when the block ends.
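Both remarks are easy to verify. In the sketch below, `io.StringIO` stands in for a real file so that no disk access is needed; the variable names and the sample data are assumptions made for the example.

```python
import io

# io.StringIO behaves like an open text file, including context-manager support.
with io.StringIO("1\t10161\n2\t11564\n") as fake_file:
    lines = [line for line in fake_file]

# Every line yielded by iteration keeps its trailing newline.
ends_with_newline = all(line.endswith("\n") for line in lines)

# Leaving the with block closes the file object.
closed_after_block = fake_file.closed
```

This is why the fixed code writes 'chr' + line directly, with no extra newline added.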

3 answers

Answer 1
1 2017-10-04 18:11:26

Answer 2
1 2017-10-04 18:12:21

Answer 3
0 2017-10-04 18:09:44
