Similar to this question, but slightly more complex.
I have a large text file that looks something like this:
" AAAAAAAAAAAAAA.BBBBBBBBBBBBBB.CCCCCCCCCCCCCC.DDDDDDDDDDDDDD.EEEEEEEEEEEEEE.FFFFFFFFFFFFFF.GGGGGGGGGGGGGG.HHHHHHHHHHHHHH.IIIIIIIIIIIIII.JJJJJJJJJJJJJJ.KKKKKKKKKKKKKK. "
Each line break is a ".", the file ends with a line break, and each line is exactly 14 characters long (15 including the "."). GollyJer's answer to the mentioned question is good, but I have a few extra requirements:
I can't load the real file into RAM, as it's over 600GB.
I don't know where to begin with altering the code to do this. Is this even possible? How can I do this? Thanks
I might explore the walrus operator to clean this up, and I really have no idea if this is going to be "fast enough". The idea is to read up to the point you want, read and print the part to delete, then read the rest:
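line_to_delete = 2
with open("in.txt", "rt") as file_in:
    with open("out.txt", "wt") as file_out:
        # Each record is 14 characters plus the "." terminator, so 15 in total.
        file_out.write(file_in.read(15 * (line_to_delete - 1)))
        print(file_in.read(15))  # the record being deleted
        file_out.write(file_in.read())  # reads everything that remains in one call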
I think that might be memory intensive, since the final read() pulls the rest of the file into memory at once, so you might get a more stream-like result by doing:
line_to_delete = 2
with open("in.txt", "rt") as file_in:
    current_line = 1
    with open("out.txt", "wt") as file_out:
        while True:
            line = file_in.read(15)
            if not line:
                break
            if current_line == line_to_delete:
                print(line)
            else:
                file_out.write(line)
            current_line += 1
Both print:
BBBBBBBBBBBBBB.
and produce a file like:
AAAAAAAAAAAAAA.CCCCCCCCCCCCCC.DDDDDDDDDDDDDD.EEEEEEEEEEEEEE.FFFFFFFFFFFFFF.GGGGGGGGGGGGGG.HHHHHHHHHHHHHH.IIIIIIIIIIIIII.JJJJJJJJJJJJJJ.KKKKKKKKKKKKKK.
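If reading 15 characters at a time turns out to be too slow, one option is to copy in larger binary chunks instead. The following is only a minimal sketch and has not been benchmarked. It assumes the file is plain ASCII, so one character is one byte, and the names delete_record, copy_exact, RECORD_SIZE and CHUNK_SIZE are made up for illustration:

```python
import shutil

RECORD_SIZE = 15          # 14 data characters plus the "." terminator
CHUNK_SIZE = 1024 * 1024  # 1 MiB buffer; an arbitrary choice, tune as needed


def copy_exact(src, dst, nbytes, chunk_size=CHUNK_SIZE):
    """Copy exactly nbytes from src to dst using a bounded buffer."""
    remaining = nbytes
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("file ended before the requested record")
        dst.write(chunk)
        remaining -= len(chunk)


def delete_record(in_path, out_path, record_number):
    """Copy in_path to out_path without the 1-based record_number.

    Returns the removed record as bytes.
    """
    with open(in_path, "rb") as file_in, open(out_path, "wb") as file_out:
        # Everything before the record to delete.
        copy_exact(file_in, file_out, RECORD_SIZE * (record_number - 1))
        # The record itself is read but never written.
        removed = file_in.read(RECORD_SIZE)
        # Everything after it, streamed in CHUNK_SIZE pieces.
        shutil.copyfileobj(file_in, file_out, CHUNK_SIZE)
    return removed
```

Calling delete_record("in.txt", "out.txt", 2) on the sample above should return the "B" record as bytes and write the remaining records to out.txt. Memory use stays bounded by the chunk size, though the output still needs as much free disk space as the input minus one record.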