
How to read certain lines after you find some text in Python?

I'm reading through a huge file made up of chunks of text that contain information I need. The only way to find that information is to search for the "header" of each chunk, a line containing "text". Finding the header is easy:

line1 = f.readline()
if "text" in line1:
  print(":)")

However, I need information out of the next 14 lines of text (specifically, I need 3rd, 12th, 14th, and 15th lines after the line where "text" is found). Currently I'm using

line2 = f.readline()
line3 = f.readline()
...
line15 = f.readline()

But this seems wildly inefficient. Is there a more concise way of doing this? I also need to be able to loop through the file, finding each instance of "text" and the information that follows it. Thank you so much.

I typically use a while loop for something like this, with a for loop nested inside:

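with open(filename) as f_in:
  while True:
    line = f_in.readline()
    if not line:  # readline() returns '' only at end of file; blank lines are '\n'
      break
    if line.strip() == "text":
      # Read all 15 following lines first, then keep the 3rd, 12th, 14th and 15th.
      # (Filtering inside the comprehension would call readline() only 4 times.)
      block = [f_in.readline().strip() for _ in range(15)]
      data = [block[i] for i in (2, 11, 13, 14)]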

This allows you to avoid loading the entire file before processing it, and is especially useful if you might have extra lines in between your data segments that you don't need to load. It will only work correctly if the segments do not overlap.

Note that this code strips leading and trailing whitespace from the lines. If you only want to remove the trailing whitespace, you can use rstrip() instead. If you want to avoid changing the line at all, you could try a prefix match with startswith() or simply include the newline character(s) in your condition.
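The matching options mentioned above can be compared side by side. This is only a small sketch: the helper name `is_header` and its `mode` parameter are made up for illustration and are not part of the answer's code.

```python
def is_header(line, header="text", mode="strip"):
    """Return True if `line` counts as the header under the chosen matching mode."""
    if mode == "strip":
        # Ignore leading and trailing whitespace, including the newline.
        return line.strip() == header
    if mode == "rstrip":
        # Ignore only trailing whitespace; leading spaces must match exactly.
        return line.rstrip() == header
    if mode == "prefix":
        # Leave the line untouched and only check how it starts.
        return line.startswith(header)
    raise ValueError(f"unknown mode: {mode!r}")
```

With `mode="prefix"` a line such as "text and more" also counts as a header, which may or may not be what you want.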

If you are sure that there are not going to be any overlapping sections, you could use something like:

needed = [3, 12, 14, 15]  # Offsets counted from the header line (1 = the line right after it)
found_at = None
with open('filename.txt') as f:
    # Iterating over the file object reads it lazily; readlines() would load the whole file
    for lineno, line in enumerate(f, start=1):  # Human readable line numbers
        if found_at is not None:
            offset = lineno - found_at
            if offset in needed:
                print(lineno, line)
            if offset >= max(needed):
                found_at = None  # Block finished; the next line is checked for "text" again
        elif "text" in line:
            found_at = lineno
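If sections might overlap after all, one possible variation is to track every header whose block is still open instead of a single position. The following is a rough sketch under that assumption; the function name `extract_overlapping` and the dictionary it returns are illustrative choices, not something from the answers here.

```python
def extract_overlapping(lines, header="text", offsets=(3, 12, 14, 15)):
    """Return {header_line_number: [wanted lines]}, even when blocks overlap."""
    wanted = set(offsets)
    span = max(offsets)
    active = []    # line numbers of headers whose blocks are still open
    results = {}
    for lineno, line in enumerate(lines, start=1):
        # Forget headers whose last wanted line has already gone by.
        active = [start for start in active if lineno - start <= span]
        for start in active:
            if lineno - start in wanted:
                results[start].append(line.rstrip("\n"))
        if header in line:
            # A header opens a new block, even inside another block.
            active.append(lineno)
            results[lineno] = []
    return results
```

Passing an open file object as `lines` keeps the line-by-line behaviour, so the whole file is never held in memory.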

You could also use a complex regex but it is probably not worth the time to construct one and would be a lot less clear.
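For comparison, here is roughly what a regex version might look like. It is a sketch only, and it assumes the whole file content is already in one string, which is a poor fit for a huge file; the names `BLOCK_RE` and `extract_with_regex` are hypothetical.

```python
import re

# A line containing "text", followed by exactly 15 complete lines.
# Every one of those 15 lines must end in a newline for the pattern to match.
BLOCK_RE = re.compile(r"^.*text.*\n((?:.*\n){15})", re.MULTILINE)

def extract_with_regex(content, offsets=(3, 12, 14, 15)):
    """Return one list of wanted lines per non-overlapping header match."""
    results = []
    for match in BLOCK_RE.finditer(content):
        following = match.group(1).splitlines()
        # Offsets are 1-based: offset 3 is the third line after the header.
        results.append([following[i - 1] for i in offsets])
    return results
```

The pattern is noticeably harder to read than the counting loop, which supports the point about clarity.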

Try to construct a loop and count your lines. Something like this:

rl = []
with open("your_file") as fd:
  cnt = 25  # start outside the range of wanted line numbers
  for l in fd:  # iterate lazily instead of loading everything with readlines()
    cnt += 1
    if "text" in l:  # header line found
      cnt = 0        # reset counter
    elif cnt in [3, 12, 14, 15]:  # counter is one of the lines you want
      rl.append(l)                # record it
print(rl)
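As an alternative to counting lines by hand, the standard library's `itertools.islice` can pull the next 15 lines off the same iterator in one step. This is a minimal sketch assuming non-overlapping blocks; the generator name `extract_after_header` and its parameters are invented for this example.

```python
from itertools import islice

def extract_after_header(lines, header="text", offsets=(3, 12, 14, 15)):
    """Yield, for each header line, the lines at the given 1-based offsets after it."""
    it = iter(lines)
    span = max(offsets)
    for line in it:
        if header in line:
            # Take the next `span` lines from the same iterator, so they are
            # not examined as headers again (non-overlapping blocks only).
            block = list(islice(it, span))
            # Skip offsets that fall past the end of a truncated final block.
            yield [block[i - 1].rstrip("\n") for i in offsets if i <= len(block)]
```

Because it accepts any iterable of lines, it can be given an open file object directly, for example `for data in extract_after_header(f): print(data)` inside a `with open(...) as f:` block.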

