Python Regular Expressions matching Email
My question is to write a function that, given an email address (a), returns (user, domain) corresponding to the user name and domain name. For example, given bob@aus.space.com it should return (bob, aus.space.com).
The function should only match if the address meets the following conditions:
A domain name must end with an alphabetic character. Alphabetic characters may be uppercase or lowercase.
No whitespace characters are allowed.
Below is my current code, and I am getting invalid syntax errors. Any insight on how to do this more easily or more cleanly would be much appreciated.
import re
def find_email (s):
    re_pattern = (r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0- 9-.]+$)")
    regular_expression_object = re.compile(re_pattern)
    match_object = regular_expression_object.match(s)
    if match_object != None:
        return (match_object.match(s).group('user'),match_object(s).group('domain'))
    else:
        raise ValueError
This code seems to be what you are looking for:
import re
def find_email(s):
    # Named groups capture the user name and the domain name.
    re_pattern = r"(?P<user>^[a-zA-Z][a-zA-Z0-9_.+-]+)@(?P<domain>[a-zA-Z0-9-._]+[a-zA-Z])$"
    regular_expression_object = re.compile(re_pattern)
    match_object = regular_expression_object.match(s)
    if match_object is not None:
        return (match_object.group('user'), match_object.group('domain'))
    else:
        raise ValueError
[In]: find_email("user@email.domain.com")
[Out]: ('user', 'email.domain.com')
If you already have a match object, you don't need to call "match" on it again; it already holds the groups.
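To illustrate the point about the match object, here is a minimal sketch. The pattern is the one from the code above; the sample address and the variable names are only assumptions for the demonstration. The compiled pattern's match() returns either a match object or None, and the groups are read directly from that object:

```python
import re

# Pattern copied from the answer above; named groups capture the two parts.
pattern = re.compile(
    r"(?P<user>^[a-zA-Z][a-zA-Z0-9_.+-]+)@(?P<domain>[a-zA-Z0-9-._]+[a-zA-Z])$"
)

# Sample input chosen for illustration.
match_object = pattern.match("bob@aus.space.com")

# The match object already holds the captured groups; no second match() call.
user = match_object.group("user")
domain = match_object.group("domain")

# Several groups can be requested at once, which returns a tuple.
both = match_object.group("user", "domain")

# All named groups are also available as a dict.
as_dict = match_object.groupdict()
```

Calling group() with two names returns the (user, domain) tuple in one step, which is the shape the question asks for.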
It is also good practice to use a regex-testing site; there are several of them, regex101 for example.
Edit: OK, changed it up a little.
Since the first character MUST be alphabetic, [a-zA-Z] checks for it. It is followed by [a-zA-Z0-9_.+-] with +, meaning 1 or more (you can change it to * if you want 1-letter usernames) of alphanumeric characters plus those few special characters you gave in the original post.
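The difference between + and * for the user name part can be checked in isolation. This is only a sketch: the two pattern names and the accepts() helper are made up for the demonstration, and the patterns cover just the user name, not the whole address.

```python
import re

# User name part only, anchored on both sides for the demonstration.
# "+" requires one or more characters after the leading letter.
at_least_two = re.compile(r"^[a-zA-Z][a-zA-Z0-9_.+-]+$")
# "*" allows zero or more characters after the leading letter.
at_least_one = re.compile(r"^[a-zA-Z][a-zA-Z0-9_.+-]*$")


def accepts(pattern, text):
    """Return True if the pattern matches the text."""
    return pattern.match(text) is not None


one_letter_with_plus = accepts(at_least_two, "b")   # rejected: too short
one_letter_with_star = accepts(at_least_one, "b")   # accepted
digit_first = accepts(at_least_one, "1bob")         # rejected: must start with a letter
```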
After @, [a-zA-Z0-9-._]+ means 1 or more of the characters in this bracket, followed by [a-zA-Z], which forces the last character before the end of the line ($) to be alphabetic.
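As a possible variant (not the code from the answer), the same character classes can be used with fullmatch(), which anchors both ends of the string itself, so ^ and $ are dropped. One difference worth knowing: with match() and $, a single trailing newline is still accepted, while fullmatch() rejects it. The names EMAIL_RE and split_email are hypothetical.

```python
import re

# Same character classes as in the answer; the hyphen is moved to the end of
# the domain class, which does not change what the class matches.
EMAIL_RE = re.compile(
    r"(?P<user>[a-zA-Z][a-zA-Z0-9_.+-]+)@(?P<domain>[a-zA-Z0-9._-]+[a-zA-Z])"
)


def split_email(address):
    """Return (user, domain), or raise ValueError if the address is rejected."""
    match = EMAIL_RE.fullmatch(address)
    if match is None:
        raise ValueError(f"not a valid address: {address!r}")
    return match.group("user"), match.group("domain")


def is_rejected(address):
    """Small helper for trying inputs: True if split_email raises ValueError."""
    try:
        split_email(address)
    except ValueError:
        return True
    return False
```

With this sketch, an address whose domain ends in a digit, or one containing a space, raises ValueError, which matches the conditions listed in the question.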
If you still have some addresses that won't match, check some regex pages; with a little tinkering it will work.