Extract a word from string

I am looking for a way to use a regex to extract the word "MONT" from a sentence that contains no spaces. I would also like to extract the first number that follows the word.

For example :

s = valoirfinalieMONT:23maning => MONT 23

s = montdj34meaing  => mont 34

s = thisisthelastmontitwillwork98help => mont 98

Thanks for your help

Try this:

import re

s = 'valoirfinalieMONT:23maning '
# Raw string so that \D and \d reach the regex engine unescaped
print(re.findall(r'(mont)\D*(\d*)', s, re.IGNORECASE))  # [('MONT', '23')]

The regex captures 'mont', then skips any number of non-digit characters (\D), and then captures any number of digits (\d).

The re.IGNORECASE flag is added so that mont, MONT, MoNt and other case variants are also matched.
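As a minimal sketch of the same idea wrapped in a function: the helper name extract_word_and_number and its return shape are my own assumptions, not part of the answer above. It uses \d+ instead of \d*, so a "mont" with no digits after it yields None rather than an empty number.

```python
import re

# Hypothetical helper for illustration; the name and return type are assumptions.
# \D* skips any non-digits after the word, \d+ requires at least one digit.
PATTERN = re.compile(r"(mont)\D*(\d+)", re.IGNORECASE)

def extract_word_and_number(text):
    """Return (word, number) for the first match, or None if nothing matches."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return m.group(1), int(m.group(2))

for s in ("valoirfinalieMONT:23maning",
          "montdj34meaing",
          "thisisthelastmontitwillwork98help"):
    print(extract_word_and_number(s))
# ('MONT', 23)
# ('mont', 34)
# ('mont', 98)
```

re.search scans the whole string, so no leading .* is needed in the pattern.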

You can also do it like this.

re.I makes the match case-insensitive. See https://docs.python.org/3/library/re.html for more details.

import re

s = "valoirfinalieMONT:23maning"
s2 = "montdj34meaing"
s3 = "thisisthelastmontitwillwork98help"

m = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s, re.I)
print(m.group(1)) # MONT
print(m.group(2)) # 23

# Same as above, using the group names instead of their positions
print(m.group('name'))    # MONT
print(m.group('number'))  # 23

m2 = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s2, re.I)
print(m2.group(1)) # mont
print(m2.group(2)) # 34

m3 = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s3, re.I)
print(m3.group(1)) # mont
print(m3.group(2)) # 98
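The three calls above repeat the same pattern, so one possible tidy-up (a sketch, not the answerer's code) is to compile it once and loop. The variable names pattern, samples and results are assumptions for illustration. Note that \D+ requires at least one non-digit character between the word and the number, so a string such as "mont23" does not match this pattern.

```python
import re

# Same named groups as in the answer, compiled once and reused.
# re.search is used, so the surrounding .* is not needed.
pattern = re.compile(r"(?P<name>mont)\D+(?P<number>\d+)", re.I)

samples = [
    "valoirfinalieMONT:23maning",
    "montdj34meaing",
    "thisisthelastmontitwillwork98help",
]

results = []
for text in samples:
    m = pattern.search(text)
    # groupdict() returns the named groups as a dict
    results.append(m.groupdict() if m else None)

for r in results:
    print(r)
# {'name': 'MONT', 'number': '23'}
# {'name': 'mont', 'number': '34'}
# {'name': 'mont', 'number': '98'}

# \D+ needs at least one separator character, so this finds nothing:
print(pattern.search("mont23"))  # None
```

If the number may follow the word directly, \D* in place of \D+ would accept that case as well.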

Here is the solution to the question you asked in the comments.

>>> import re
>>>
>>> s = 'valoir13-10-2012finalie13/10/2012MONT:23,00maning'
>>> m = re.match(r".*(\d{2}-\d{2}-\d{4}).*(\d{2}/\d{2}/\d{4}).*(MONT).*(\d{2},\d{2})", s, re.I)
>>> m
<_sre.SRE_Match object; span=(0, 43), match='valoir13-10-2012finalie13/10/2012MONT:23,00'>
>>>
>>> m.group(0)
'valoir13-10-2012finalie13/10/2012MONT:23,00'
>>>
>>> d = m.group(1)
>>> d
'13-10-2012'
>>> arr = d.split("-")
>>> arr
['13', '10', '2012']
>>>
>>> '-'.join(arr[:2] + [arr[2][-2:]])
'13-10-12'
>>>
>>> ans1 = '-'.join(arr[:2] + [arr[2][-2:]])
>>> ans1
'13-10-12'
>>>
>>> ans2 = m.group(2)
>>> ans2
'13/10/2012'
>>>
>>> ans3 = m.group(3)
>>> ans3
'MONT'
>>>
>>> ans4 = m.group(4)
>>> ans4
'23,00'
>>>
>>> output = ' '.join([ans1, ans2, ans3, ans4])
>>> output
'13-10-12 13/10/2012 MONT 23,00'
>>>
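The interactive steps above could be folded into one function, roughly as sketched below. The names RECORD and summarize are hypothetical. The pattern is a variation on the one in the session: it captures day, month and year separately and allows one or more digits before the comma, which is an assumption about the input format.

```python
import re

# Hypothetical pattern and helper; names and the \d+ before the comma are assumptions.
RECORD = re.compile(
    r"(\d{2})-(\d{2})-(\d{4})"   # dash-separated date, split into day, month, year
    r".*?(\d{2}/\d{2}/\d{4})"    # slash-separated date, kept as-is
    r".*?(MONT)"                 # the keyword, matched case-insensitively
    r"\D*(\d+,\d{2})",           # amount with a decimal comma
    re.I,
)

def summarize(text):
    """Return 'dd-mm-yy dd/mm/yyyy MONT amount', or None if the text does not match."""
    m = RECORD.search(text)
    if m is None:
        return None
    day, month, year, slash_date, word, amount = m.groups()
    short_date = "-".join([day, month, year[-2:]])  # keep only the last two year digits
    return " ".join([short_date, slash_date, word, amount])

print(summarize("valoir13-10-2012finalie13/10/2012MONT:23,00maning"))
# 13-10-12 13/10/2012 MONT 23,00
```

Returning None on a failed match avoids the AttributeError that m.group(...) would raise if re.match found nothing.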
