
Skip chunks of lines while reading a file in Python

I have a file that consists of curve data, repetitively structured as follows:

numberofsamples
Title
     data
     data
     data
      ...

For example:

999numberofsamples
title crvTitle
             0.0            0.866423
    0.0001001073           0.6336382
    0.0002002157           0.1561626
    0.0003000172          -0.1542121
             ...                 ...
1001numberofsamples
title nextCrv
    0.000000e+00        0.000000e+00
    1.001073e-04        1.330026e+03
    2.002157e-04        3.737352e+03
    3.000172e-04        7.578963e+03
             ...                 ...

The file consists of many curves and can be up to 2 GB.

My task is to find and export a specific curve by skipping the chunks (curves) that don't interest me. I know the length of each curve (its number of samples), so there should be a way to jump from one delimiter (e.g. numberofsamples) to the next until I find the title I need?

I tried to use an iterator to do this, but without success. Is an iterator the right way to accomplish the task?

If possible, I don't want to keep the data in memory.

This is a general way to skip lines you don't care about:

for line in file:
    if 'somepattern' not in line:
        continue
    # if we got here, 'somepattern' is in the line, so process it
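
Since you know the number of samples from each header line, a variant of this pattern can skip whole data blocks by count instead of testing every line. Here is a minimal sketch of that idea; find_curve and the exact layout parsing (a '<N>numberofsamples' line, a 'title <name>' line, then exactly N data lines) are my assumptions about your format:

import itertools

def find_curve(path, wanted_title):
    with open(path) as f:
        for header in f:
            if 'numberofsamples' not in header:
                continue
            # e.g. '999numberofsamples' -> 999
            n_samples = int(header.split('numberofsamples')[0])
            if next(f).strip() == 'title ' + wanted_title:
                # found it: read only this curve's data lines
                return [next(f) for _ in range(n_samples)]
            # not interesting: consume the data lines without storing them
            # (the itertools 'consume' recipe)
            next(itertools.islice(f, n_samples, n_samples), None)
    return None

Called as find_curve('somefile.txt', 'nextCrv'), this touches each line at most once and keeps only the one wanted curve in memory.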

You don't need to keep all the lines in memory. Skip to the wanted title, then save only the lines you want after it:

with open('somefile.txt') as lines:
    # skip ahead until we hit the wanted title
    for line in lines:
        if line.strip() == 'title youwant':
            break
    numbers = []
    # collect data lines until the next curve header
    for line in lines:
        if 'numberofsamples' in line:
            break  # the next curve's samples start here
        numbers.append(line)
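
If you don't want to hold even the wanted curve in memory, you can stream the matching block straight to an output file instead of collecting it in a list. A small variant of the above, with 'curve.txt' as a placeholder output name:

with open('somefile.txt') as lines, open('curve.txt', 'w') as out:
    # skip ahead to the wanted title
    for line in lines:
        if line.strip() == 'title youwant':
            break
    # copy data lines until the next curve header starts
    for line in lines:
        if 'numberofsamples' in line:
            break
        out.write(line)

Nothing is accumulated here; each data line goes straight to disk.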
