How to efficiently read a large file with a custom newline character using Python?
How to efficiently read and delete a specific line of a large file with a custom newline character using Python (3.9 preferred)?
I have a very large txt file that looks like this:
"
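Every newline is a ".", the file ends with a newline, and each line is exactly 14 characters long. GollyJer's answer to the question above is good, but I have some additional requirements:
I can't load the real txt file into RAM because it is over 600GB
I don't know where to start changing the code to do this. Is it possible? How can I do it? Thanks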
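I would probably explore the walrus operator to clean this up, and I really don't know whether it will be "fast enough". The idea is to read up to the point you want, read/print the part to delete, then read the rest: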
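line_to_delete = 2
with open("in.txt", "rt") as file_in:
    with open("out.txt", "wt") as file_out:
        # Each record is 15 characters: 14 characters plus the "." terminator.
        file_out.write(file_in.read(15 * (line_to_delete - 1)))
        print(file_in.read(15))  # the record being deleted
        file_out.write(file_in.read())  # everything after it, in a single read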
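I think this could be memory-intensive, because the final read() pulls the whole rest of the file into memory at once. Reading one record at a time should give a smoother result: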
line_to_delete = 2
with open("in.txt", "rt") as file_in:
    current_line = 1
    with open("out.txt", "wt") as file_out:
        while True:
            line = file_in.read(15)  # one 15-character record at a time
            if not line:  # empty string means end of file
                break
            if current_line == line_to_delete:
                print(line)  # show the deleted record instead of writing it
            else:
                file_out.write(line)
            current_line += 1
Both versions print BBBBBBBBBBBBBB.
and produce a file like this:
AAAAAAAAAAAAAA.CCCCCCCCCCCCCC.DDDDDDDDDDDDDD.EEEEEEEEEEEEEE.FFFFFFFFFFFFFF.GGGGGGGGGGGGGG.HHHHHHHHHHHHHH.IIIIIIIIIIIIII.JJJJJJJJJJJJJJ.KKKKKKKKKKKKKK.
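As a possible refinement of the second version, the walrus operator mentioned above can tidy the loops, and copying in larger blocks may cut the number of read/write calls on a 600GB file. The following is only a sketch under some assumptions: the file uses a single-byte encoding such as ASCII (so 15 characters are 15 bytes and binary mode is safe), and the names `delete_record`, `RECORD_LEN` and `CHUNK_SIZE` are made up for illustration. It has not been benchmarked, so whether it is "fast enough" would still need measuring.

```python
RECORD_LEN = 15            # 14 characters plus the "." terminator
CHUNK_SIZE = 1024 * 1024   # hypothetical 1 MiB copy buffer; tune as needed


def delete_record(src_path, dst_path, line_to_delete, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path without record number line_to_delete (1-based).

    Returns the removed record as a string. Assumes a single-byte encoding,
    so that every record occupies exactly RECORD_LEN bytes.
    """
    # Number of bytes that precede the record to delete.
    remaining = RECORD_LEN * (line_to_delete - 1)
    with open(src_path, "rb") as file_in, open(dst_path, "wb") as file_out:
        # Copy everything before the target record in bounded chunks.
        while remaining > 0 and (chunk := file_in.read(min(chunk_size, remaining))):
            file_out.write(chunk)
            remaining -= len(chunk)
        # Read the target record and do not write it.
        removed = file_in.read(RECORD_LEN)
        # Copy everything after it, again in bounded chunks.
        while chunk := file_in.read(chunk_size):
            file_out.write(chunk)
    return removed.decode("ascii")


if __name__ == "__main__":
    print(delete_record("in.txt", "out.txt", 2))
```

Memory use stays bounded by the chunk size regardless of file size, but like the versions above this still writes a second file of nearly the same size, so it needs roughly another 600GB of free disk space.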