How do I read the preceding lines relative to a search hit in a log file with Python?

Question

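I'm new to Python and am trying it out.
I have a large file. After finding a search phrase, I need to go back n lines to the beginning of the enclosing text, i.e. the start tag, and then read from that position onward.

The phrases can occur multiple times, and there are multiple start tags. Here is a sample file: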

<module>
hi
flowers
<name>xxx</name>
<age>46</age>
</module>
<module>
<place>yyyy</place>
<name>janiiiii</janii>
</module>

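Suppose I search for a name. Once it is found, I need to go back to the <module> line of that block. The number of lines between <module> and <name> varies; it is not fixed. So once the name is found, I need to go back to the module line and start reading from there.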

Here is my code:

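lastiterline = None   # text of the most recent <module> line
linec = 0             # line number of the most recent <module> line
line_num = 0
search_phrase = "janiii"  # lowercase, to match the sample file
# Raw string so the backslash in the Windows path is not treated as an escape.
with open(r"c:\sample.txt", "r") as f:
    for line in f:
        line_num += 1
        line = line.strip()
        if line.startswith("<module>"):
            lastiterline = line
            linec = line_num
        elif line.find(search_phrase) >= 0:
            if lastiterline:
                print(line)   # the line that matched
                print(linec)  # line number of the enclosing <module>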

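This gives me the line number of the module that contains the searched word, but I cannot move the file pointer back to start reading lines from that module again. There will be multiple search phrases, so each time I need to go back to the module line without disrupting the main loop, which reads through the whole large file.

For example: there may be 100 module tags, and only 10 of them may contain the search phrases I want, so I only need those 10 modules.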

Answer 1

OK, here is an example for you; you can adapt it to your specific needs.

Here is a sample of your huge_file.txt:

wgoi jowijg
<start tag>
wfejoije jfie
fwjoejo
THE PHRASE
jwieo
<end tag>
wjefoiw wgworjg
<start tag>
wjgoirg 
<end tag>
<start tag>
wfejoije jfie
fwjoejo
woeoj
jwieo
THE PHRASE
<end tag>

And a script, read_prev_lines.py:

# Read all lines into memory (simple, but costly if the file is truly huge).
with open("huge_file.txt", "r") as f:
  hugefile = f.readlines()

# One dict per block, holding the line indices of its start tag, phrase, and end tag.
start_locations = []
current_block = -1
for idx, line in enumerate(hugefile):
  if "<start tag>" in line:
    start_locations.append({"start": idx})
    current_block += 1
  if current_block < 0:
    continue  # skip lines that come before the first start tag
  if "THE PHRASE" in line:
    start_locations[current_block]["phr"] = idx
  if "<end tag>" in line:
    start_locations[current_block]["end"] = idx

# Print every block in which the phrase was found.
for idx in range(len(start_locations)):
  if "phr" in start_locations[idx]:
    print("Found THE PHRASE after %d start tag(s), at line %d:" % (idx, start_locations[idx]["phr"]))
    print("Here is the whole block that contains the phrase:")
    print(hugefile[start_locations[idx]["start"]: start_locations[idx]["end"]+1])
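
If the file is too large to load with readlines(), one possible alternative is to avoid moving the file pointer at all: keep the lines of the current block in a small buffer while scanning, and emit the buffer when the end tag is reached and the phrase was seen. The sketch below is not part of the original answer; the function name blocks_containing and its parameters are made up for illustration, and it assumes blocks are not nested and each tag sits on its own line.

```python
def blocks_containing(lines, phrase, start_tag="<start tag>", end_tag="<end tag>"):
    """Yield each start-to-end block (a list of lines) that contains phrase.

    `lines` can be any iterable of strings, including an open file object,
    so only the current block is held in memory at any time.
    """
    block = None   # lines of the block being read; None while outside a block
    found = False  # whether phrase has been seen in the current block
    for line in lines:
        if start_tag in line:
            block = []      # a new block begins; drop any unfinished one
            found = False
        if block is None:
            continue        # ignore text outside of any block
        block.append(line)
        if phrase in line:
            found = True
        if end_tag in line:
            if found:
                yield block
            block = None    # back outside a block until the next start tag


if __name__ == "__main__":
    # Small in-memory demo using the tags from the question's sample file.
    sample = [
        "<module>\n", "hi\n", "<name>xxx</name>\n", "</module>\n",
        "<module>\n", "<place>yyyy</place>\n", "<name>janiiiii</janii>\n", "</module>\n",
    ]
    for blk in blocks_containing(sample, "janiii", "<module>", "</module>"):
        print("".join(blk), end="")
```

With a real file, the same function could be used as `with open("huge_file.txt") as f: for blk in blocks_containing(f, "THE PHRASE"): ...`. Since every matching block is yielded complete, from its start tag onward, there is no need to seek backward in the file.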

Solution 1
0 Accepted 2019-02-21 10:48:26
