Skipping blocks of lines when reading a file in Python

Question

I have a file containing curve data with the following repeating structure:

numbersofsamples
Title
     data
     data
     data
      ...

For example:

999numberofsamples
title crvTitle
             0.0            0.866423
    0.0001001073           0.6336382
    0.0002002157           0.1561626
    0.0003000172          -0.1542121
             ...                 ...
1001numberofsamples
title nextCrv
    0.000000e+00        0.000000e+00
    1.001073e-04        1.330026e+03
    2.002157e-04        3.737352e+03
    3.000172e-04        7.578963e+03
             ...                 ...

The file consists of many curves and can be up to 2 GB in size.

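My task is to find and export specific curves by skipping the blocks (curves) I am not interested in. I know the length of each curve (the number of samples), so there should be a way to jump to the next delimiter (e.g. numberofsamples) until I find the title I need?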

I tried to do this with an iterator, unfortunately without any success. Is that the right way to accomplish the task?

If possible, I don't want to keep the data in memory.

Answer 1

Here is a general way to skip the lines you don't care about:

for line in file:
    if 'somepattern' not in line:
        continue
    # if we got here, 'somepattern' is in the line, so process it

Answer 2

You don't need to keep all the lines in memory. Skip ahead to the title you want, then save only the lines you want:

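with open('somefile.txt') as lines:
    # skip ahead to the title of the curve you want
    for line in lines:
        # strip() removes the trailing newline so the comparison can match
        if line.strip() == 'title youwant':
            break
    # collect the data lines until the next block header
    numbers = []
    for line in lines:
        if 'numberofsamples' in line:
            break  # the next curve starts here
        numbers.append(line)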
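
Since the question mentions that each block header carries the sample count, here is a minimal sketch of using that count to step over unwanted blocks with itertools.islice. It assumes the header looks exactly like the example (digits followed by numberofsamples), that the count equals the exact number of data lines, and that the title line reads "title <name>". The function name read_curve and the HEADER pattern are hypothetical, made up for illustration. Note that skipped lines are still read from disk; they just are never stored.

```python
import re
from itertools import islice

# Assumed header format, e.g. "999numberofsamples" (taken from the example in the question)
HEADER = re.compile(r"^\s*(\d+)\s*numberofsamples\s*$")


def read_curve(path, wanted_title):
    """Return the rows of the curve titled `wanted_title`, or None if not found."""
    with open(path) as f:
        for line in f:
            match = HEADER.match(line)
            if match is None:
                continue  # blank or unexpected line between blocks
            count = int(match.group(1))
            # The line after the header is assumed to be "title <name>"
            title = next(f, "").strip()
            if title == f"title {wanted_title}":
                # Keep only this curve: read exactly `count` data lines
                return [row.split() for row in islice(f, count)]
            # Not the curve we want: consume `count` lines without storing them
            for _ in islice(f, count):
                pass
    return None
```

Calling read_curve('somefile.txt', 'nextCrv') would return the rows of that curve as lists of strings, which can then be converted with float() or written to another file. Only the requested curve is held in memory.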
