REGEX Python Second Match

I am trying to extract the second match of the pattern "LOCATION \s+\S+" from the following text:

 PAGE    1

                BID OPENING DATE    07/25/18    FROM 0.2 MILES WEST OF ICE HOUSE        07/26/18 CONTRACT NUMBER    03-2F1304   ROAD TO 0.015 MILES WEST OF CONTRACT CODE 'A '

            LOCATION    03-ED-50-39.5/48.7  DIVISION HIGHWAY ROAD   44 CONTRACT ITEMS

        INSTALL SANDTRAPS AND PULLOUTS  FEDERAL AID ACNH-P050-(146)E

PAGE    1

                    BID OPENING DATE    07/25/18    IN EL DORADO COUNTY AT VARIOUS          07/26/18 CONTRACT NUMBER     03-2H6804  LOCATIONS ALONG ROUTES 49 AND 193   CONTRACT CODE 'C ' LOCATION 03-ED-0999-VAR          13 CONTRACT ITEMS



        TREE REMOVAL    FEDERAL AID NONE

PAGE    1

                BID OPENING DATE    07/25/18    IN LOS ANGELES, INGLEWOOD AND       07/26/18 CONTRACT NUMBER    07-296304   CULVER CITY, FROM I-105 TO PORT CONTRACT CODE 'B '

            LOCATION    07-LA-405-R21.5/26.3    ROAD UNDERCROSSING  55 CONTRACT ITEMS



        ROADWAY SAFETY IMPROVEMENT  FEDERAL AID ACIM-405-3(056)E

I am trying to get LOCATION 03-ED-0999-VAR (the second match) from the text. Is there a way to specify that we want the second, third, or nth match in Python? Right now, I have the following code:

# imports
import re  # only re is needed for this snippet

# triple quotes allow a multi-line string that itself contains single quotes
text = """ PAGE    1

                BID OPENING DATE    07/25/18    FROM 0.2 MILES WEST OF ICE HOUSE        07/26/18 CONTRACT NUMBER    03-2F1304   ROAD TO 0.015 MILES WEST OF CONTRACT CODE 'A '

            LOCATION    03-ED-50-39.5/48.7  DIVISION HIGHWAY ROAD   44 CONTRACT ITEMS

        INSTALL SANDTRAPS AND PULLOUTS  FEDERAL AID ACNH-P050-(146)E

PAGE    1

                    BID OPENING DATE    07/25/18    IN EL DORADO COUNTY AT VARIOUS          07/26/18 CONTRACT NUMBER     03-2H6804  LOCATIONS ALONG ROUTES 49 AND 193   CONTRACT CODE 'C ' LOCATION 03-ED-0999-VAR          13 CONTRACT ITEMS



        TREE REMOVAL    FEDERAL AID NONE

PAGE    1

                BID OPENING DATE    07/25/18    IN LOS ANGELES, INGLEWOOD AND       07/26/18 CONTRACT NUMBER    07-296304   CULVER CITY, FROM I-105 TO PORT CONTRACT CODE 'B '

            LOCATION    07-LA-405-R21.5/26.3    ROAD UNDERCROSSING  55 CONTRACT ITEMS



        ROADWAY SAFETY IMPROVEMENT  FEDERAL AID ACIM-405-3(056)E"""

# re.search() returns only the first match, as a Match object (or None if nothing matches)
location1 = re.search(r'LOCATION \s+\S+', text)

Instead of re.search(), which returns only the first match, you could use re.findall(). It returns all non-overlapping matches as a list of strings, so you can index whichever one you want and also count how many there are.

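# \s+ directly after LOCATION allows one or more whitespace characters; 'LOCATION \s+' would
# require at least two and would skip 'LOCATION 03-ED-0999-VAR' in the text as shown above
location1 = re.findall(r"LOCATION\s+\S+", text)
print(len(location1))  # number of matches
print(location1[1])  # second match (list indices start at 0)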
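
If you only need one particular match and would rather not build the whole list, something along these lines might also work. This is a minimal sketch, not part of the original answer: the helper name `nth_match` and the shortened `sample` string are made up for illustration, and it assumes the pattern `LOCATION\s+\S+` (one or more whitespace characters after LOCATION). It uses re.finditer(), which yields matches lazily, together with itertools.islice() to skip ahead to the n-th one.

```python
import re
from itertools import islice


def nth_match(pattern, text, n):
    """Return the n-th match (1-based) of pattern in text, or None if there are fewer than n."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # finditer() yields Match objects one at a time; islice() skips the first n - 1
    match = next(islice(re.finditer(pattern, text), n - 1, None), None)
    return match.group(0) if match else None


# Shortened, hypothetical stand-in for the bid text in the question
sample = (
    "LOCATION    03-ED-50-39.5/48.7  DIVISION HIGHWAY ROAD\n"
    "LOCATIONS ALONG ROUTES 49 AND 193   CONTRACT CODE 'C ' LOCATION 03-ED-0999-VAR   13 CONTRACT ITEMS\n"
    "LOCATION    07-LA-405-R21.5/26.3    ROAD UNDERCROSSING\n"
)

print(nth_match(r"LOCATION\s+\S+", sample, 2))  # LOCATION 03-ED-0999-VAR
print(nth_match(r"LOCATION\s+\S+", sample, 4))  # None (only three matches)
```

Returning None when there are fewer than n matches avoids the IndexError that indexing a too-short findall() list would raise. Note that "LOCATIONS ALONG" is not matched, because the pattern requires whitespace immediately after LOCATION.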
