I have a big data text file, for example:
#01textline1
1 2 3 4 5 6
2 3 5 6 7 3
3 5 6 7 6 4
4 6 7 8 9 9
1 2 3 6 4 7
3 5 7 7 8 4
4 6 6 7 8 5
3 4 5 6 7 8
4 6 7 8 8 9
..
..
You do not need a loop to accomplish this. Just use the index method on the list to find the positions of the two marker lines, then take all the lines between them.
Note that I changed your file.readlines() call to strip trailing newlines. (Using file.read().splitlines() can fail if read() ends in the middle of a line of data.)
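# Read every line, stripping trailing whitespace (including the newline)
file1 = open("data.txt", "r")
file2 = open("newdata.txt", "w")
lines = [line.rstrip() for line in file1.readlines()]
# Locate the two marker lines
firstIndex = lines.index("#02textline2")
secondIndex = lines.index("#03textline3")
print(firstIndex, secondIndex)
# Write only the lines strictly between the markers
file2.write("\n".join(lines[firstIndex + 1 : secondIndex]))
file1.close()
file2.close()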
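The same index-based slice can be tried on an in-memory list before touching real files. This is only a minimal sketch: the helper name `lines_between` and the sample lines are illustrative assumptions, not part of the original question. It also starts the search for the second marker after the first one, which the code above does not do.

```python
def lines_between(lines, first_key, second_key):
    """Return the lines strictly between first_key and second_key.

    list.index raises ValueError if either marker is missing.
    """
    stripped = [line.rstrip() for line in lines]
    first_index = stripped.index(first_key)
    # Look for the end marker only after the start marker.
    second_index = stripped.index(second_key, first_index + 1)
    return stripped[first_index + 1:second_index]


# Hypothetical sample data shaped like the file in the question.
sample = [
    "#01textline1\n",
    "1 2 3 4 5 6\n",
    "#02textline2\n",
    "3 5 6 7 6 4\n",
    "4 6 7 8 9 9\n",
    "#03textline3\n",
    "1 2 3 6 4 7\n",
]
result = lines_between(sample, "#02textline2", "#03textline3")
print(result)
```

If a marker is absent, the `ValueError` from `list.index` propagates to the caller, so it may be worth catching when the input is not trusted.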
There is a newline character at the end of every line, so this:
if line == "#03textline3":
will never be true, as the line is actually "#03textline3\n". Why didn't you use the same syntax as the one you used for "#02textline2"? It would have worked:
if "#03textline3" in line:  # or: line == "#03textline3\n"
    break
Besides, you have to correct the indentation of the always_print = True line.
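The effect of the trailing newline can be checked without a real file. The sketch below uses `io.StringIO` as a stand-in for the opened file; the contents are made up for illustration and are not the asker's data.

```python
import io

# io.StringIO stands in for an opened text file (illustrative data only).
fake_file = io.StringIO("#02textline2\n1 2 3\n#03textline3\n4 5 6\n")

exact_matches = []
stripped_matches = []
for line in fake_file:
    # Each line read from the file still carries its trailing "\n".
    exact_matches.append(line == "#03textline3")
    stripped_matches.append(line.rstrip("\n") == "#03textline3")

print(exact_matches)     # the bare marker never equals a raw line
print(stripped_matches)  # the marker line matches once the newline is removed
```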
Here's what I would suggest doing:
firstKey = "#02textline2"
secondKey = "#03textline3"
with open("data.txt", "r") as fread:
    # Skip everything up to and including the first marker
    for line in fread:
        if line.rstrip() == firstKey:
            break
    with open("newdata.txt", "w") as fwrite:
        # Resume from the current position and copy until the second marker
        for line in fread:
            if line.rstrip() == secondKey:
                break
            else:
                fwrite.write(line)
This approach takes advantage of the fact that Python treats files like iterators. The first for loop iterates through the file iterator fread until the first key is found. The loop breaks, but the iterator stays at its current position. When it is picked back up, the second loop starts where the first left off. We then write the lines you want directly to a new file and discard the rest.
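The resumption behaviour can be seen in isolation with an in-memory file. This is a small sketch under assumed data: `io.StringIO` replaces the real file, and the `START` and `END` markers are hypothetical placeholders for the two keys.

```python
import io

# In-memory stand-in for the data file (hypothetical contents).
fake_file = io.StringIO("a\nSTART\nb\nc\nEND\nd\n")

# First loop: consume lines up to and including the START marker.
for line in fake_file:
    if line.rstrip() == "START":
        break

# Second loop: continues from where the first loop stopped.
kept = []
for line in fake_file:
    if line.rstrip() == "END":
        break
    kept.append(line)

# Whatever follows the END marker is still unread at this point.
rest = list(fake_file)

print(kept)
print(rest)
```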
Advantages:
This does not load the entire file into memory: each line between firstKey and secondKey is written out as it is read, and nothing after secondKey is ever read by the script.
No line is examined or processed more than once.
The with context manager is a safer way to work with files, since it closes them even if an error occurs.
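One way to check the claim about how much of the input is read is to wrap the source in a counting generator. The sketch below is illustrative only: the `counting` helper, the `counter` dictionary, and the sample lines are assumptions introduced here, and a plain list stands in for the file.

```python
def counting(lines, counter):
    """Yield each line unchanged while counting how many were consumed."""
    for line in lines:
        counter["read"] += 1
        yield line


# Hypothetical sample data; two lines follow the second marker.
data = [
    "#01textline1\n",
    "1 2 3 4 5 6\n",
    "#02textline2\n",
    "3 5 6 7 6 4\n",
    "#03textline3\n",
    "1 2 3 6 4 7\n",
    "3 5 7 7 8 4\n",
]
counter = {"read": 0}
source = counting(data, counter)  # one shared iterator for both loops

for line in source:
    if line.rstrip() == "#02textline2":
        break

written = []
for line in source:
    if line.rstrip() == "#03textline3":
        break
    written.append(line)

print(written)          # only the line between the two markers
print(counter["read"])  # lines consumed, including both marker lines
```

With this sample, the two lines after the second marker are never pulled from the source, which matches the second half of the first advantage.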