Writing a string from one very long file to another file in Python

Please do not behead me for my noob question. I have looked up many other questions on Stack Overflow concerning this topic, but haven't found a solution that works as intended.

The problem: I have a fairly large text file (about 5 MB) that I want to copy into a new file via readlines() or any other built-in string-handling function. For smaller files the following code works (only schematically coded here):

f = open('C:/.../old.txt', 'r');
n = open('C:/.../new.txt', 'w');
for line in f:
    print(line, file=n);

However, as I found out here (UnicodeDecodeError: 'charmap' codec can't encode character X at position Y: character maps to undefined), internal restrictions of Windows prohibit this from working on larger files. So far, the only solution I have come up with is the following:

f = open('C:/.../old.txt', 'r', encoding='utf8', errors='ignore');
n = open('C:/.../new.txt', 'a');
for line in f:
    print(line, file=sys.stderr) and append(line, file='C:/.../new.txt');   

f.close();
n.close();

But this doesn't work. I do get a new.txt file, but it is empty. So, how do I iterate through a long text file and write every line into a new text file? Is there a way to read sys.stderr as the source for the new file (I actually have no idea what sys.stderr is)? I know this is a noob question, but I don't know where to look for an answer anymore.

Thanks in advance!

There is no need to use print(); just write to the file directly:

with open('C:/.../old.txt', 'r') as f, open('C:/.../new.txt', 'w') as n:
    n.writelines(f)

However, it sounds like you may have an encoding issue, so make sure that both files are opened with the correct encoding. If you post the error output, more help may be possible.
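As a rough sketch of what opening both files "with the correct encoding" could look like: the example below passes the same encoding explicitly to both open() calls, so Python does not fall back on the platform default (on Windows this is often a legacy code page, which is where 'charmap' errors tend to come from). It assumes the source file is UTF-8, which the question does not confirm; the function name copy_text and the temporary file names are hypothetical.

```python
import os
import tempfile


def copy_text(src_path, dst_path, encoding="utf-8"):
    """Copy a text file line by line, using one explicit codec for both files."""
    with open(src_path, "r", encoding=encoding) as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            dst.write(line)  # each line already carries its own newline


# Small demonstration in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.txt")
    new_path = os.path.join(tmp, "new.txt")
    with open(old_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line with \u2192 arrow\n")
    copy_text(old_path, new_path)
    with open(new_path, encoding="utf-8") as f:
        copied = f.read()
```

If the real file is not UTF-8, the same pattern applies with whatever encoding the file actually uses.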

BTW: Python doesn't use ; as a line terminator. It can be used to separate two statements if you want to put them on the same line, but this is generally considered bad form.

You can redirect standard output to a file, as in my code. I successfully copied a 6 MB text file this way.

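import sys

# Redirect standard output to the destination file, then restore it afterwards
original_stdout = sys.stdout
with open("bigcopy.txt", "w") as bigoutput, open("big.txt", "r") as biginput:
    sys.stdout = bigoutput
    try:
        for bigline in biginput:
            # print() adds its own newline, so drop the one read from the file
            print(bigline.rstrip("\n"))
    finally:
        sys.stdout = original_stdout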
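A possible variant of the same idea, sketched here as an alternative rather than as what the answer used: the standard library's contextlib.redirect_stdout sends print() output to another stream and puts sys.stdout back when the block ends. The helper name copy_via_print is hypothetical, and in-memory streams stand in for the real files.

```python
import contextlib
import io


def copy_via_print(src, dst):
    """Write every line of src to dst with print(), redirecting stdout temporarily."""
    with contextlib.redirect_stdout(dst):
        for line in src:
            print(line, end="")  # the line already ends with its own newline


# Demonstration with in-memory text streams instead of files on disk.
source = io.StringIO("alpha\nbeta\n")
target = io.StringIO()
copy_via_print(source, target)
result = target.getvalue()
```

With real files, src and dst would be the objects returned by open() in text mode.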

Why don't you just use the shutil module and copy the file?
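A minimal sketch of the shutil suggestion, assuming the goal is an exact copy with no per-line processing. shutil.copyfile copies the bytes as they are, so no decoding or encoding takes place. The temporary file names are hypothetical.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.txt")
    new_path = os.path.join(tmp, "new.txt")
    # Create a sample source file containing non-ASCII text.
    with open(old_path, "wb") as f:
        f.write("caf\u00e9\n".encode("utf-8") * 1000)
    shutil.copyfile(old_path, new_path)  # byte-for-byte copy
    with open(old_path, "rb") as a, open(new_path, "rb") as b:
        identical = a.read() == b.read()
```

If each line has to be inspected or changed on the way, a line-by-line loop with explicit encodings is still needed.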

You can try this code; it works for me.

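# Open both files in binary mode so bytes are copied unchanged
# (in Python 3, writing text lines to a file opened with "wb" raises TypeError)
with open("file_path/../large_file.txt", "rb") as f:
    with open("file_path/../new_file", "wb") as new_f:
        # The with blocks close both files automatically
        new_f.writelines(f)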
