Python: read specific lines from a file and continue
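I am trying to read specific lines from a file, and to keep reading after each block has been processed. Say my file has 19,000 lines. Each time, I extract the first 19 lines, do some calculations on them, and write the output to another file. Then I extract the next 19 lines and repeat the same processing. So I tried to extract the lines like this: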
from collections import defaultdict
from itertools import izip_longest

n = 19
x = defaultdict(list)
i = 0
fp = open("file")
for next_n_lines in izip_longest(*[fp] * n):
    lines = next_n_lines
    for i, line in enumerate(lines):
        pass  # do calculation
    # write results
This code works for the first chunk. Can anyone help me with how to continue to the next chunks of n lines? Thanks in advance!
Your code already extracts the lines in groups of 19, so I am not sure what your question is.
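To see why the loop already moves on to the next block: `[fp] * n` repeats the same file iterator n times, so each tuple the zip produces consumes the next n lines. Below is a minimal sketch, assuming Python 3 (where `izip_longest` is named `zip_longest`) and using an in-memory `io.StringIO` as a stand-in for the real file. The chunk size of 3 is only for illustration.

```python
from io import StringIO
from itertools import zip_longest

n = 3  # small chunk size for illustration; the question uses 19
# Stand-in for open("file"): seven numbered lines
fp = StringIO("".join("line {}\n".format(i) for i in range(7)))

# [fp] * n is a list of n references to the SAME iterator, so every
# tuple pulls the next n lines; the last tuple is padded with None.
chunks = list(zip_longest(*[fp] * n))

for chunk in chunks:
    print([line.strip() if line is not None else None for line in chunk])
```

Each printed list holds the next three lines, and the last one is padded with `None` because seven is not a multiple of three.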
I can tidy up your solution a little, but it does the same thing as your code:
from itertools import izip_longest

# grouping recipe from itertools documentation
def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

def process_chunk(chunk):
    "Return sequence of result lines. Chunk must be iterable."
    for i, line in enumerate(chunk):
        yield 'file-line {1:03d}; chunk-line {0:02d}\n'.format(i, int(line))
    yield '----------------------------\n'
Here is some test code demonstrating that every line is visited:
from StringIO import StringIO

class CtxStringIO(StringIO):
    def __enter__(self):
        return self
    def __exit__(self, *args):
        return False

infile = CtxStringIO(''.join('{}\n'.format(i) for i in xrange(19*10)))
outfile = CtxStringIO()

# this should be the main loop of your program.
# just replace infile and outfile with real file objects
with infile as ifp, outfile as ofp:
    for chunk in grouper(19, ifp, '\n'):
        ofp.writelines(process_chunk(chunk))

# see what was written to the file
print ofp.getvalue()
This test case should print lines like the following:
file-line 000; chunk-line 00
file-line 001; chunk-line 01
file-line 002; chunk-line 02
file-line 003; chunk-line 03
file-line 004; chunk-line 04
...
file-line 016; chunk-line 16
file-line 017; chunk-line 17
file-line 018; chunk-line 18
----------------------------
file-line 019; chunk-line 00
file-line 020; chunk-line 01
file-line 021; chunk-line 02
...
file-line 186; chunk-line 15
file-line 187; chunk-line 16
file-line 188; chunk-line 17
file-line 189; chunk-line 18
----------------------------
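For readers on Python 3, the same grouping approach can be sketched as follows. This is an adaptation under stated assumptions, not the answer's original code: `izip_longest` becomes `zip_longest`, `xrange` becomes `range`, `io.StringIO` already works as a context manager, and the padding value is left as `None` so that a short final chunk can be skipped explicitly instead of being passed to `int()`. The input of 40 lines is chosen only to show a partial last chunk.

```python
from io import StringIO
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def process_chunk(chunk):
    """Yield one result line per input line, then a separator."""
    for i, line in enumerate(chunk):
        if line is None:  # padding in a short final chunk; skip it
            continue
        yield "file-line {:03d}; chunk-line {:02d}\n".format(int(line), i)
    yield "----------------------------\n"


# 40 numbered lines: two full chunks of 19 plus a final chunk of 2
infile = StringIO("".join("{}\n".format(i) for i in range(40)))
outfile = StringIO()

with infile as ifp, outfile as ofp:
    for chunk in grouper(19, ifp):
        ofp.writelines(process_chunk(chunk))
    result = ofp.getvalue()  # read before the with-block closes the buffer

print(result)
```

With real files, `infile` and `outfile` would be replaced by `open(...)` calls, and the `getvalue()` line would no longer be needed.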
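Your question is not entirely clear, but I guess the calculation you do depends on all N lines you extract (19 in your example).
So it is better to collect all of those lines first and then do the work: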
N = 19
inFile = open('myFile')
i = 0
lines = list()
for line in inFile:
    lines.append(line)
    i += 1
    if i == N:
        # Do calculations and save on output file
        lines = list()
        i = 0
# If the file length is not a multiple of N, the leftover lines are still in `lines` here
This solution does not need to load all the lines into memory.
n = 19
fp = open("file")
next_n_lines = []
for line in fp:
    next_n_lines.append(line)
    if len(next_n_lines) == n:
        # do calculation
        next_n_lines = []
if len(next_n_lines) > 0:
    pass  # do calculation on the final, shorter chunk
# write results
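Another way to write the same loop is with `itertools.islice`, which pulls at most n lines from the file iterator on each call. This is a sketch of a different technique from the answers above, assuming Python 3. `read_chunks` and `summarize` are hypothetical names: `summarize` stands in for the real calculation, and the `io.StringIO` objects stand in for the real input and output files.

```python
from io import StringIO
from itertools import islice


def read_chunks(lines, n):
    """Yield lists of up to n lines; the last list may be shorter."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, n))
        if not chunk:  # iterator exhausted
            return
        yield chunk


def summarize(chunk):
    """Hypothetical calculation: sum the integers in one chunk."""
    return sum(int(line) for line in chunk)


# Stand-in input: 40 numbered lines (two chunks of 19, one chunk of 2)
infile = StringIO("".join("{}\n".format(i) for i in range(40)))
outfile = StringIO()  # stand-in for the output file

for chunk in read_chunks(infile, 19):
    outfile.write("{}\n".format(summarize(chunk)))

results = outfile.getvalue().splitlines()
print(results)
```

Only one chunk is held in memory at a time, and the final shorter chunk is handled by the same code path as the full ones.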