
Python, Extracting 3 lines before and after a match

I am trying to figure out how to extract 3 lines before and after a matched word.

At the moment, my word is found. I wrote up some text to test my code, and I figured out how to print three lines after my match.

But I am having difficulty figuring out how to print the three lines before the word "secure".

Here is what I have so far:

from itertools import islice

with open("testdoc.txt", "r") as f:
    for line in f:
        if "secure" in line:
            print("".join(line))  # the matching line
            print("".join(islice(f, 3)))  # the three lines after the match

Here is the text I created for testing:

----------------------------
 This is a test to see
if i can extract information
using this code
I hope, I try, 
maybe secure shell will save thee
Im adding extra lines to see my output
hoping that it comes out correctly
boy im tired, sleep is nice
until then, time will suffice

I came up with this solution: keep the previous lines in a list, and drop the oldest one once the list holds more than 4 elements.

from itertools import islice

with open("testdoc.txt", "r") as f:
    linesBefore = []  # the current line plus up to 3 lines before it
    for line in f:
        linesBefore.append(line.rstrip())
        if len(linesBefore) > 4:  # keep at most 4 entries
            linesBefore.pop(0)
        if "secure" in line:
            # every entry except the last (the match itself) is preceding context;
            # this also covers a match with fewer than 3 lines before it
            for previous in linesBefore[:-1]:
                print(previous)
            print(line.rstrip())  # the matching line
            print("".join(islice(f, 3)))  # the next 3 lines, read ahead from the file

You need to buffer your lines so you can recall them. The simplest way is to just load all the lines into a list:

with open("testdoc.txt", "r") as f:
    lines = f.readlines()  # read all lines into a list
    for index, line in enumerate(lines):  # enumerate the list and loop through it
        if "secure" in line:  # check if the current line has your substring
            print(line.rstrip())  # print the current line (stripped of trailing whitespace)
            print("".join(lines[max(0, index - 3):index]))  # print up to three lines preceding it
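
The snippet above prints only the preceding lines. As a minimal sketch, the same slicing idea can cover both sides of the match. The helper name `context_around` and its `before`/`after` parameters are hypothetical, introduced here only for illustration:

```python
def context_around(lines, needle, before=3, after=3):
    """Return one list of lines per match: context before, the match, context after.

    `context_around` is a hypothetical helper, not part of the original answer.
    """
    blocks = []
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)  # clamp so a negative index never wraps around
            stop = index + after + 1        # slicing past the end of a list is safe
            blocks.append(lines[start:stop])
    return blocks


# Possible usage, assuming testdoc.txt exists next to the script:
# with open("testdoc.txt", "r") as f:
#     for block in context_around(f.read().splitlines(), "secure"):
#         print("\n".join(block))
```

Clamping the start index with `max(0, ...)` matters because a negative slice start would count from the end of the list instead of stopping at the first line.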

But if you need maximum storage efficiency, you can use a buffer to store the last 3 lines as you loop over the file line by line. A collections.deque is ideal for that.
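
A minimal sketch of that buffered approach follows, assuming a `collections.deque` with `maxlen` for the preceding lines and `itertools.islice` to read ahead for the following ones. The generator name `grep_context` and its parameters are hypothetical:

```python
from collections import deque
from itertools import islice


def grep_context(stream, needle, before=3, after=3):
    """Yield (preceding, match, following) for each line containing `needle`.

    `grep_context` is a hypothetical helper name. Lines consumed as
    "following" context are not searched again, so a match that falls
    inside that window is skipped.
    """
    buffer = deque(maxlen=before)  # silently discards the oldest line when full
    lines = iter(stream)
    for line in lines:
        if needle in line:
            following = list(islice(lines, after))  # read ahead up to `after` lines
            yield list(buffer), line, following
            buffer.append(line)
            buffer.extend(following)  # keep the buffer in step with the file position
        else:
            buffer.append(line)


# Possible usage, assuming testdoc.txt exists next to the script:
# with open("testdoc.txt", "r") as f:
#     for preceding, match, following in grep_context(f, "secure"):
#         print("".join(preceding + [match] + following), end="")
```

Only `before` lines plus the read-ahead window are held in memory at any time, so this works on files too large to load with `readlines()`.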

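What if I need to do the same thing when the word "secure" occurs multiple times, taking only the last occurrence and extracting the two lines above and below it?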
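
One possible way to handle that case, sketched under the assumption that the whole file fits in memory: remember the index of the most recent match while scanning, then slice around it once at the end. The function name `last_match_context` and the `context` parameter are hypothetical:

```python
def last_match_context(lines, needle, context=2):
    """Return the lines around the last line containing `needle`, or [] if there is none.

    `last_match_context` is a hypothetical helper used for illustration.
    """
    last_index = None
    for index, line in enumerate(lines):
        if needle in line:
            last_index = index  # later matches overwrite earlier ones
    if last_index is None:
        return []
    start = max(0, last_index - context)  # clamp at the top of the file
    return lines[start:last_index + context + 1]


# Possible usage, assuming testdoc.txt exists next to the script:
# with open("testdoc.txt", "r") as f:
#     print("\n".join(last_match_context(f.read().splitlines(), "secure")))
```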
