Extract a word from a string

I am looking for a way to use a regex to extract the word "MONT" from a sentence that contains no spaces. I would also like to extract the next number that appears after that word.

For example:

s = valoirfinalieMONT:23maning => MONT 23

s = montdj34meaing  => mont 34

s = thisisthelastmontitwillwork98help => mont 98

Thanks for your help.

Try this:

import re

s = 'valoirfinalieMONT:23maning'
# Raw string so that \D and \d reach the regex engine unchanged
print(re.findall(r'(mont)\D*(\d*)', s, re.IGNORECASE))  # [('MONT', '23')]

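The regex captures 'mont', skips any number of non-digit characters (\D*), and then captures any number of digits (\d*). Because \d* also matches zero digits, the second group is an empty string when no number follows; use \d+ if a number must be present.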

The re.IGNORECASE flag is added so that mont, MONT, MoNt and similar variants are all captured.
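As a minimal sketch of the approach described above, the pattern can be wrapped in a small helper and run against the three strings from the question. The helper name extract_word_and_number is made up for illustration, and the sketch assumes a number must be present, so it uses \d+ rather than \d*.

```python
import re

# Pattern from the answer above, written as a raw string.
# Assumption: a number is required, hence \d+ instead of \d*.
PATTERN = re.compile(r"(mont)\D*(\d+)", re.IGNORECASE)


def extract_word_and_number(text):
    """Return (word, number) for the first 'mont' followed by digits, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


for s in ("valoirfinalieMONT:23maning",
          "montdj34meaing",
          "thisisthelastmontitwillwork98help"):
    print(extract_word_and_number(s))
```

The number comes back as a string; wrap it in int() if a numeric value is needed.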

You can also try it like this.

re.I makes the match case-insensitive. See https://docs.python.org/3/library/re.html for more details.

import re

s = "valoirfinalieMONT:23maning"
s2 = "montdj34meaing"
s3 = "thisisthelastmontitwillwork98help"

m = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s, re.I)
print(m.group(1)) # MONT
print(m.group(2)) # 23

# Same as above, using the group names
print(m.group('name'))    # MONT
print(m.group('number'))  # 23

m2 = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s2, re.I)
print(m2.group(1)) # mont
print(m2.group(2)) # 34

m3 = re.match(r".*(?P<name>mont)\D+(?P<number>\d+).*", s3, re.I)
print(m3.group(1)) # mont
print(m3.group(2)) # 98
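Two things may be worth keeping in mind with the code above: re.match returns None when the pattern does not match, so calling .group() on the result would raise an AttributeError, and the greedy leading .* means only the last "mont" in the string is reported. A possible variation, sketched here with a hypothetical helper find_all_pairs and a made-up second test string, compiles the pattern once and uses finditer to collect every occurrence.

```python
import re

# Compile once and reuse. No leading ".*" is needed because
# finditer() scans the whole string for successive matches.
MONT_NUMBER = re.compile(r"(?P<name>mont)\D+(?P<number>\d+)", re.I)


def find_all_pairs(text):
    """Return every (name, number) pair found in text as a list of tuples."""
    return [(m.group("name"), m.group("number"))
            for m in MONT_NUMBER.finditer(text)]


print(find_all_pairs("valoirfinalieMONT:23maning"))  # one pair
print(find_all_pairs("mont:1andMont:22"))            # two pairs (made-up input)
print(find_all_pairs("nothinghere"))                 # empty list, no exception
```

An empty list stands in for "no match", so callers do not need a separate None check.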

Here is a solution to the follow-up question you mentioned in the comments.

>>> import re
>>>
>>> s = 'valoir13-10-2012finalie13/10/2012MONT:23,00maning'
>>> m = re.match(r".*(\d{2}-\d{2}-\d{4}).*(\d{2}/\d{2}/\d{4}).*(MONT).*(\d{2},\d{2})", s, re.I)
>>> m
<_sre.SRE_Match object; span=(0, 43), match='valoir13-10-2012finalie13/10/2012MONT:23,00'>
>>>
>>> m.group(0)
'valoir13-10-2012finalie13/10/2012MONT:23,00'
>>>
>>> d = m.group(1)
>>> d
'13-10-2012'
>>> arr = d.split("-")
>>> arr
['13', '10', '2012']
>>>
>>> '-'.join(arr[:2] + [arr[2][-2:]])
'13-10-12'
>>>
>>> ans1 = '-'.join(arr[:2] + [arr[2][-2:]])
>>> ans1
'13-10-12'
>>>
>>> ans2 = m.group(2)
>>> ans2
'13/10/2012'
>>>
>>> ans3 = m.group(3)
>>> ans3
'MONT'
>>>
>>> ans4 = m.group(4)
>>> ans4
'23,00'
>>>
>>> output = ' '.join([ans1, ans2, ans3, ans4])
>>> output
'13-10-12 13/10/2012 MONT 23,00'
>>>
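The session above could also be condensed into a single function. The following is only a sketch: the function name summarize and the group names are invented for illustration, and it assumes the dashed date is in day-month-year order so that datetime can shorten the year instead of slicing the string by hand.

```python
import re
from datetime import datetime

# Named groups replace the positional groups 1-4 used in the session above.
PATTERN = re.compile(
    r"(?P<dashed>\d{2}-\d{2}-\d{4})"       # e.g. 13-10-2012
    r".*?(?P<slashed>\d{2}/\d{2}/\d{4})"   # e.g. 13/10/2012
    r".*?(?P<word>mont)"                   # the word itself, any case
    r"\D*(?P<amount>\d{2},\d{2})",         # e.g. 23,00
    re.I,
)


def summarize(text):
    """Return 'dd-mm-yy dd/mm/yyyy WORD amount', or None if text does not match."""
    m = PATTERN.search(text)
    if m is None:
        return None
    # Assumption: the dashed date is day-month-year.
    short_date = datetime.strptime(m.group("dashed"), "%d-%m-%Y").strftime("%d-%m-%y")
    return " ".join([short_date, m.group("slashed"), m.group("word"), m.group("amount")])


print(summarize("valoir13-10-2012finalie13/10/2012MONT:23,00maning"))
```

Parsing with strptime has the side effect of rejecting impossible dates such as 99-99-2012, which the string-slicing version would pass through unchanged.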

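Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.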

 