Python regex: back-referencing a matched regex group

Question

I am trying to return 2 subgroups from my regex match:

import re
email_add = "John@Doe.com <John@Doe.com>"
m = re.match(r"(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b) <(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)", email_add)

But it doesn't seem to match:

>>> m.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

I suspect I did not group it correctly, or I'm using an incorrect word boundary. I tried \w instead of \b, but the result is the same.

Could someone please point out my errors?

Answer 1

You are matching uppercase A-Z letters only, so the lowercase character sequences ohn, oe, and com cause the pattern not to match anything.
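To see the failure in isolation, here is a minimal sketch using a cut-down pattern (the short patterns are illustrative, not the one from the question):

```python
import re

# An uppercase-only class consumes "J" and stops at the lowercase "o".
prefix = re.match(r"[A-Z]+", "John@Doe.com")
print(prefix.group())

# Requiring "@" right after that run therefore fails, and match() returns None.
whole = re.match(r"[A-Z]+@", "John@Doe.com")
print(whole)
```

That None is what triggers the AttributeError in the question when .group() is called on it.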

Adding the re.I case-insensitive flag makes your pattern work:

>>> import re
>>> email_add = "John@Doe.com <John@Doe.com>"
>>> re.match(r"(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b) <(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)", email_add)
>>> re.match(r"(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b) <(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)", email_add, re.I)
<_sre.SRE_Match object at 0x1030d4f10>
>>> _.groups()
('John@Doe.com', 'John@Doe.com')

Or you could add a-z to the character classes instead:

>>> re.match(r"(\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b) <(\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b)", email_add)
<_sre.SRE_Match object at 0x1030d4f10>
>>> _.groups()
('John@Doe.com', 'John@Doe.com')
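If this goes into a script rather than the REPL, one possible arrangement is to compile the pattern once and guard against a failed match. The names ADDRESS, PAIR, SAME, extract_pair and repeats_address are hypothetical, and the \1 back-reference variant is an assumption about what the question title is after; it is not part of the answer above.

```python
import re

# Same address sub-pattern as in the question, written once.
ADDRESS = r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}"

# Two independent groups, as in the question, made case-insensitive.
PAIR = re.compile(rf"(\b{ADDRESS}\b) <(\b{ADDRESS}\b)", re.IGNORECASE)

# Variant (assumption): \1 requires the bracketed part to repeat group 1.
SAME = re.compile(rf"(\b{ADDRESS}\b) <\1>", re.IGNORECASE)

def extract_pair(text):
    """Return both captured addresses, or None when the text does not match."""
    m = PAIR.match(text)
    return m.groups() if m else None

def repeats_address(text):
    """True when the address inside <...> repeats the one before it."""
    return SAME.match(text) is not None

print(extract_pair("John@Doe.com <John@Doe.com>"))
print(repeats_address("John@Doe.com <Jane@Doe.com>"))
```

Checking for None before calling .groups() avoids the AttributeError shown in the question.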

Answer 2

What's wrong with your regex has already been pointed out, but you may also want to consider email.utils.parseaddr:

>>> from email.utils import parseaddr
>>> email_add = "John@Doe.com <John@Doe.com>"
>>> parseaddr(email_add)
('', 'John@Doe.com')  # doesn't get first part, so could assume it's same as 2nd?
>>> email_add = "John Doe <John@Doe.com>"
>>> parseaddr(email_add)
('John Doe', 'John@Doe.com') # does get name and email
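Following the comment in the first example, a small wrapper could fall back to the address when no display name comes back. This is only a sketch; split_address is a hypothetical helper, not part of email.utils.

```python
from email.utils import parseaddr

def split_address(header_value):
    """Return (display_name, address), reusing the address when the name is empty."""
    name, addr = parseaddr(header_value)
    return (name or addr, addr)

print(split_address("John Doe <John@Doe.com>"))
```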


Answer 1
2 (accepted) 2013-03-01 17:12:56

Answer 2
2 2013-03-01 17:19:13
