Search pattern in text

I have developed this code to find a pattern in a text:

pattern = re.compile(r'\: (\d{2})/(\d{2})/(\d{4})')
match = re.search(pattern, txt)

My pattern is a date in the form dd/mm/yyyy. The problem is that two dates may appear in the text, but I want only one of them. The only difference between the two is the text that precedes the date:

text1: dd/mm/yyyy
text2: dd/mm/yyyy

I only want the date that is preceded by text2. How can I do that?

Use text2 in the pattern and capture the date subpattern:

import re
txt = """text1: 12/05/2015
text2: 22/05/2016"""
pattern = re.compile(r'text2:\s*(\d{2}/\d{2}/\d{4})')
match = re.search(pattern, txt)
if match:
    print(match.group(1))

Details:

  • text2: - a literal substring
  • \s* - zero or more whitespace characters
  • (\d{2}/\d{2}/\d{4}) - capturing group 1, which matches 2 digits, /, 2 digits, / and then 4 digits.

The re.search method returns the first match in the string, or None if there is no match. When a match is found, the contents of the first capturing group, which is the date, are available as match.group(1).
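The same idea can be wrapped in a small helper so the label is not hard-coded. This is only a sketch: the function name find_date_after and its parameters are hypothetical, and re.escape is used on the assumption that a label might contain regex metacharacters.

```python
import re

def find_date_after(label, text):
    """Return the dd/mm/yyyy date that follows 'label:' in text, or None."""
    # re.escape keeps any special characters in the label literal
    pattern = re.compile(re.escape(label) + r':\s*(\d{2}/\d{2}/\d{4})')
    match = pattern.search(text)
    # group(1) is the date captured after the label
    return match.group(1) if match else None

txt = """text1: 12/05/2015
text2: 22/05/2016"""

print(find_date_after("text2", txt))  # prints 22/05/2016
```

Returning None when the label is absent mirrors what re.search itself does, so the caller can use the same `if result:` check as in the answer above.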

You could put every date you find in a list and then take the last one. Note that this only works if the date you want is the last one in the text.

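import re

list_of_dates = []
# A single capturing group around the whole date, so group(1) is dd/mm/yyyy
pattern = re.compile(r': (\d{2}/\d{2}/\d{4})')

for date in pattern.finditer(txt):
    list_of_dates.append(date.group(1))  # group(1) is the captured date

print(list_of_dates[-1])  # The last date found (IndexError if there were no matches)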
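If the wanted date is not guaranteed to come last, one variation is to capture the label together with the date and look the date up by label. This is a sketch under the assumption that each label is a single run of word characters (matched by \w+) directly followed by a colon; the name dates_by_label is made up for illustration.

```python
import re

txt = """text1: 12/05/2015
text2: 22/05/2016"""

# Group 1 captures the label, group 2 captures the date
pattern = re.compile(r'(\w+):\s*(\d{2}/\d{2}/\d{4})')

# Map each label to the date that follows it
dates_by_label = {m.group(1): m.group(2) for m in pattern.finditer(txt)}

print(dates_by_label.get("text2"))  # prints 22/05/2016
```

Using dict.get returns None instead of raising an error when the label does not occur in the text.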
