Regular expression with python / re.match does not work

Question

I have a string like this:

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/> 
"""

I'd like to get:

cooking food blog 5 years

I tried many different regexes, like:

p = re.compile('<mainDescription description=\"([^\"]+)\"\/>')
print re.match(p, sText)

or

p = re.compile(ur'<mainDescription description="([^"]+)"\/>')

and also with (.+) as the group. According to regex101.com, my regex should work correctly, but it doesn't. I have no idea why.

Answer 1

Try using findall():

print(re.findall(r'<mainDescription description=\"([^\"]+)\"\/>', sText))

Output:

['cooking food blog 5 years']
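As a small sketch of why this works: re.findall() scans the whole string and, when the pattern has one capture group, returns a list with that group's text for every match. The variable names below are illustrative, the input is a shortened copy of the question's string, and Python 3 syntax is assumed.

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''

pattern = r'<mainDescription description="([^"]+)"/>'

# With exactly one capture group, findall() returns a list of that group's text.
descriptions = re.findall(pattern, s_text)
print(descriptions)  # ['cooking food blog 5 years']

# When nothing matches, findall() returns an empty list rather than None,
# so guard before indexing into the result.
missing = re.findall(r'<noSuchTag description="([^"]+)"/>', s_text)
first = descriptions[0] if descriptions else None
print(first, missing)
```

Since the result is always a list, this is convenient when the tag may appear several times; for a single expected match, re.search() avoids the extra indexing step.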

Answer 2

It seems to be because you're using re.match() instead of re.search(). re.match() only matches at the very start of the string, while re.search() scans the whole string for the first position where the pattern matches. This works:

import re

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/>
"""
p = re.compile(r'<mainDescription description=\"([^\"]+)\"\/>')
print(re.search(p, sText).group(1))

By the way, you do not need to escape the quotation marks ( " ) if the string is delimited with single quotes ( ' ), so this is enough:

re.search('<mainDescription description="([^"]+)"/>', sText)
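To make the difference concrete, here is a minimal sketch that runs both functions against the same compiled pattern. The names are illustrative, the input is a shortened copy of the question's string, and Python 3 syntax is assumed.

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''

pattern = re.compile(r'<mainDescription description="([^"]+)"/>')

# match() only tries position 0, where the string begins with <firstName ...>,
# so it returns None here.
at_start = pattern.match(s_text)

# search() tries successive positions until the pattern fits.
anywhere = pattern.search(s_text)

print(at_start)  # None

# Both functions return None on failure, so check before calling .group().
if anywhere is not None:
    print(anywhere.group(1))  # cooking food blog 5 years
```

Note that passing re.MULTILINE does not change this: even in multiline mode, match() still only matches at the beginning of the string, not at the beginning of each line.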

Answer 3

re.match and re.search return a match object (or None if nothing matched), from which you need to retrieve the desired group. Because the tag is not at the start of the string, search() is the call that finds it here.

import re

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/>
"""
r = re.compile(r"""<mainDescription description="(?P<description>[^"]+)"\/>""")
# search(), not match(): the tag does not sit at the start of the string
m = r.search(sText)
print(m.group('description'))

Note that it's also possible to access the group by index (1 in this case; index 0 is the entire match), but I prefer to specify a name.
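For reference, a short sketch of how group access works on a match object in Python's re module: group(0) is the entire matched text, numbered capture groups start at 1, and a named group can be read by either its name or its number. The variable names are illustrative and Python 3 syntax is assumed.

```python
import re

s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''

m = re.search(r'<mainDescription description="(?P<description>[^"]+)"/>', s_text)

whole = m.group(0)                # the entire matched tag
by_index = m.group(1)             # first capture group, by position
by_name = m.group('description')  # the same group, by name

print(whole)     # <mainDescription description="cooking food blog 5 years"/>
print(by_index)  # cooking food blog 5 years
print(by_name)   # cooking food blog 5 years
```

Names tend to survive later edits to the pattern better than positions do, since adding another group earlier in the pattern shifts the numbering.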

3 solutions

Solution 1: score 1, accepted, 2016-03-13 23:35:54
Solution 2: score 0, 2016-03-13 23:33:14
Solution 3: score 0, 2016-03-13 23:36:31
