regex with python / re.match doesn't work
I've got a string like this:
sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/>
"""
I'd like to get:
cooking food blog 5 years
I tried many different regexes, like:
p = re.compile('<mainDescription description=\"([^\"]+)\"\/>')
print re.match(p, sText)
or
p = re.compile(ur'<mainDescription description="([^"]+)"\/>')
and also using (.+). According to regex101.com my regex should work correctly, but it doesn't. I have no idea why.
Try using findall():
print(re.findall('<mainDescription description=\"([^\"]+)\"\/>', sText))
Output:
['cooking food blog 5 years']
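As a minimal sketch of why findall() is convenient here (written in Python 3 syntax, which is an assumption since the thread uses Python 2 print statements; the variable names are illustrative): with a single capturing group it returns a list of the captured text, and an empty list when nothing matches, so there is no None to check for.

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X"/>
'''

# With exactly one capturing group, findall returns a list of that group's text.
descriptions = re.findall(r'<mainDescription description="([^"]+)"/>', s_text)
print(descriptions)

# When the pattern is not found, the result is an empty list rather than None.
missing = re.findall(r'<noSuchTag description="([^"]+)"/>', s_text)
print(missing)
```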
It seems this is because you're using re.match() instead of re.search(). re.match() only matches at the start of the string, while re.search() searches anywhere in it. This works:
sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/>
"""
p = re.compile('<mainDescription description=\"([^\"]+)\"\/>')
print(re.search(p, sText).group(1))
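To see the difference side by side, here is a small sketch in Python 3 syntax (an assumption; the names s_text, at_start and anywhere are illustrative). It also guards against None, which both functions return when there is no match.

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''

pattern = re.compile(r'<mainDescription description="([^"]+)"/>')

# match() only tries position 0, and the string starts with <firstName ...>.
at_start = pattern.match(s_text)

# search() scans forward until the pattern fits somewhere.
anywhere = pattern.search(s_text)

print(at_start)
if anywhere is not None:
    print(anywhere.group(1))
```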
By the way, you do not need to escape the quotation marks ( " ) if you're delimiting the string with ' , meaning this is enough:
re.search('<mainDescription description="([^"]+)"/>', sText)
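A quick way to convince yourself of this, sketched in Python 3 syntax (an assumption; the variable names are illustrative): the escaped and unescaped literals are the same string, and the backslash before the slash makes no difference to what the regex matches.

```python
import re

s_text = '<mainDescription description="cooking food blog 5 years"/>'

# Inside a single-quoted literal, \" and " produce the same character.
escaped = '<mainDescription description=\"([^\"]+)\"/>'
plain = '<mainDescription description="([^"]+)"/>'
same_pattern = (escaped == plain)

# The forward slash has no special meaning in Python's re module,
# so \/ and / match the same text (raw strings keep the backslash intact).
with_backslash = re.search(r'description="([^"]+)"\/>', s_text).group(1)
without_backslash = re.search(r'description="([^"]+)"/>', s_text).group(1)

print(same_pattern)
print(with_backslash == without_backslash)
```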
re.match and re.search return a match object (or None when nothing matches), from which you need to retrieve the desired group.
sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/>
"""
r = re.compile(r"""<mainDescription description="(?P<description>[^"]+)"/>""")
# Use search, not match: the tag is not at the start of the string,
# so match would return None and .group would raise AttributeError.
m = r.search(sText)
print(m.group('description'))
Note that it's also possible to access the group by index (1 in this case; index 0 is the whole match), but I prefer to specify a keyword.
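The different ways of reading a group can be compared in a short sketch (Python 3 syntax, which is an assumption; the names whole, by_index and by_name are illustrative):

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''

pattern = re.compile(r'<mainDescription description="(?P<description>[^"]+)"/>')
m = pattern.search(s_text)

whole = m.group(0)                 # the entire matched tag
by_index = m.group(1)              # first capturing group, by position
by_name = m.group('description')   # the same group, by its name
as_dict = m.groupdict()            # all named groups as a dict

print(whole)
print(by_index)
print(by_name)
print(as_dict)
```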