Search a text file for a multi-line string and return the line number in Python

I am trying to search a text file and match part (or all) of the text spanning two separate lines. I need to return the line number (within the text file) of the first line of the match.

An example text file could be:

This is some text on the first line
Here is some more on the second line
This third line has more text.

If I searched for the string "second line This third line" (which runs from the end of line 2 into the start of line 3), it should return line number 2 (or 1 if counting starts at 0).

I have looked at many similar examples and it seems that I should use the re package, but I cannot work out how to return the line number (see Python - Find line number from text file, Python regex: Search across multilines, re.search Multiple lines Python).

This code finds the string across multiple lines:

import re

a = open('example.txt','r').read()
if re.findall('second line\nThis third line', a, re.MULTILINE):
    print('found!')

The code below reads the text file line by line in a loop. I realise it will not find a match for the multi-line string, because it only ever sees one line at a time.

with open('example.txt') as f:
    for line_no, line in enumerate(f):
        if line == 'second line\nThis third line':
            print ('String found on line: ' + str(line_no))
            break
    else: # for loop ended => line not found
        line_no = -1
        print ('\nString Not found')

Question: How do I get the code in my first example to return the line number in the text file, or how do I place that code in some sort of loop that counts the lines?

Use .count() and the match object to count the number of newlines before the match:

import re

with open('example.txt', 'r') as file:
    content = file.read()
match = re.search('second line\nThis third line', content)
if match:
    print('Found a match starting on line', content.count('\n', 0, match.start()))

match.start() is the position of the start of the match in content.

content.count('\n', 0, match.start()) counts the number of newlines in content between character position 0 and the start of the match.

Use 1 + content.count('\n', 0, match.start()) if you prefer line numbers to start at 1 instead of 0.
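As a minimal sketch, the same idea can be wrapped in a small helper. The function name find_line_number and the first_line parameter are made up for illustration, and re.escape is added on the assumption that the search text should be matched literally rather than as a regular expression:

```python
import re

def find_line_number(content, needle, first_line=0):
    """Return the line number on which `needle` starts, or None if absent."""
    # re.escape makes the needle match literally, even if it contains
    # regex metacharacters such as '.' or '('.
    match = re.search(re.escape(needle), content)
    if match is None:
        return None
    # Every newline before the match start is one complete preceding line.
    return first_line + content.count('\n', 0, match.start())

sample = (
    "This is some text on the first line\n"
    "Here is some more on the second line\n"
    "This third line has more text.\n"
)
needle = "second line\nThis third line"
print(find_line_number(sample, needle))                # prints 1 (0-based)
print(find_line_number(sample, needle, first_line=1))  # prints 2 (1-based)
```

Returning None instead of printing lets the caller decide how to report a missing string.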

You would either need the whole content as a string (file.read()), or you could try:

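found = None
with open('example.txt') as f:
    for idx, line in enumerate(f):
        if "second line" in line:
            # or line.rstrip('\n').endswith("second line")
            found = idx
        elif "This third line" in line:
            # or line.startswith("This third line")
            # compare against None: index 0 is falsy but is a valid line number
            if found is not None and (idx - 1) == found:
                print("Found the overall needle at {}".format(found))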
This may work for you:

import re

with open('example.txt') as f:
    content = f.read()
if re.search('second line\nThis third line', content):
    print('found!')

needle = 'second line\nThis third line'.split('\n')
count = 0    # number of needle lines matched so far
found = -1   # 0-based line number where the current match began
with open('example.txt') as f:
    for line_no, line in enumerate(f):
        if needle[count] not in line:
            count = 0  # streak broken; retry this line as the first needle line
        if needle[count] in line:
            if count == 0:
                found = line_no
            count += 1
            if count == len(needle):
                print('String found on line: ' + str(found))
                break  # stop here; otherwise needle[count] raises IndexError
    else:  # for loop ended without break => string not found
        found = -1
        print('\nString Not found')
