I have a file which consists of curve data repetitively structured as follows:
numbersofsamples
Title
data
data
data
...
For example:
999numberofsamples
title crvTitle
0.0 0.866423
0.0001001073 0.6336382
0.0002002157 0.1561626
0.0003000172 -0.1542121
... ...
1001numberofsamples
title nextCrv
0.000000e+00 0.000000e+00
1.001073e-04 1.330026e+03
2.002157e-04 3.737352e+03
3.000172e-04 7.578963e+03
... ...
The file consists of many curves and can be up to 2 GB.
My task is to find and export a specific curve by skipping the chunks (curves) that are not interesting to me. I know the length of each curve (its number of samples), so there should be a way to jump to the next delimiter (e.g. numberofsamples) until I find the title that I need.
I tried to use an iterator to do that, unfortunately without any success. Is that the right way to accomplish the task?
If possible, I don't want to load the whole data into memory.
This is a general way to skip lines you don't care about:

for line in file:
    if 'somepattern' not in line:
        continue
    # if we got here, 'somepattern' is in the line, so process it
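As a minimal, self-contained sketch of that pattern (the helper name find_matching and the sample text are made up for illustration; an in-memory io.StringIO stands in for a real file object):

```python
import io

def find_matching(lines, pattern):
    """Yield only the lines containing pattern; skip everything else."""
    for line in lines:
        if pattern not in line:
            continue
        yield line

# Usage: iterate lazily, nothing except matching lines is kept around
sample = io.StringIO("alpha\nsomepattern here\nbeta\nmore somepattern\n")
matches = list(find_matching(sample, 'somepattern'))
```

Because find_matching is a generator, it reads one line at a time, so even a 2 GB file never ends up in memory.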
You don't need to keep all lines in memory. Skip ahead to the wanted title, and only save the lines after it:

with open('somefile.txt') as lines:
    # skip to the title we want (strip the trailing newline before comparing)
    for line in lines:
        if line.strip() == 'title youwant':
            break
    numbers = []
    for line in lines:
        if 'numberofsamples' in line:
            break  # reached the start of the next curve
        numbers.append(line)
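Since you already know each curve's sample count from its header, you can also skip each uninteresting chunk in bulk with itertools.islice instead of testing every line. A sketch, assuming headers look exactly like the ones shown ('999numberofsamples' followed by a 'title ...' line); the function name extract_curve is hypothetical:

```python
import io
from itertools import islice

def extract_curve(lines, wanted_title):
    """Scan chunk headers, skipping each curve's data lines in one go,
    until the wanted title is found; then yield only that curve's lines."""
    for line in lines:
        # Each chunk starts with a line like '999numberofsamples'
        if line.strip().endswith('numberofsamples'):
            n = int(line.strip().replace('numberofsamples', ''))
            title = next(lines).strip()  # e.g. 'title crvTitle'
            if title == 'title ' + wanted_title:
                # Yield exactly the n data lines of this curve, then stop
                yield from islice(lines, n)
                return
            # Not the curve we want: consume its n data lines unprocessed
            for _ in islice(lines, n):
                pass

# Usage with a small in-memory stand-in for the 2 GB file
data = io.StringIO(
    "2numberofsamples\n"
    "title crvTitle\n"
    "0.0 0.866423\n"
    "0.0001 0.6336\n"
    "2numberofsamples\n"
    "title nextCrv\n"
    "0.0 0.0\n"
    "0.0001 1330.0\n"
)
curve = [line.split() for line in extract_curve(data, 'nextCrv')]
```

islice advances the same underlying file iterator the for loop uses, so the curve data you don't want is read past but never stored.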