Python Regular Expressions matching Email

My question is to write a function that, given an email address (a), returns (user, domain), the user name and domain name of that address. Given bob@aus.space.com, it should return (bob, aus.space.com).

The function should only match if the address meets the following conditions:

A domain name must end with an alphabetic character. Alphabetic characters may be uppercase or lowercase. No whitespace characters are allowed.

Below is my current code, and I am getting invalid syntax errors. Any insight on how to do this more easily or cleanly would be much appreciated.

import re
def find_email (s):
  re_pattern = (r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0- 9-.]+$)")
  regular_expression_object = re.compile(re_pattern)
  match_object = regular_expression_object.match(s)
  if match_object != None:
    return (match_object.match(s).group('user'),match_object(s).group('domain'))
  else:
    raise ValueError

This code seems to be what you are looking for:

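import re

def find_email(s):
    # Named groups capture the user name and the domain.
    # The user name must start with a letter; the domain must end with one.
    re_pattern = r"(?P<user>^[a-zA-Z][a-zA-Z0-9_.+-]+)@(?P<domain>[a-zA-Z0-9-._]+[a-zA-Z])$"
    regular_expression_object = re.compile(re_pattern)
    match_object = regular_expression_object.match(s)
    if match_object is not None:
        # The match object already holds the groups.
        return (match_object.group('user'), match_object.group('domain'))
    else:
        raise ValueError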

[In]: find_email("user@email.domain.com")
[Out]: ('user', 'email.domain.com')

If you already have a match object, you don't need to call "match" on it again; it already holds the groups.
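To illustrate that point, here is a minimal sketch. The pattern is a simplified one chosen only for this illustration (it is not the pattern from the answer), and the variable names are hypothetical. It shows that the object returned by match already exposes the named groups, and that it has no match method of its own:

```python
import re

# Simplified illustrative pattern with two named groups (an assumption,
# not the pattern used in the answer).
pattern = re.compile(r"(?P<user>[^@\s]+)@(?P<domain>[^@\s]+)")

m = pattern.match("bob@aus.space.com")

# m is an re.Match object; the groups are read from it directly.
user = m.group("user")
domain = m.group("domain")
print((user, domain))        # ('bob', 'aus.space.com')

# groupdict() returns all named groups at once.
print(m.groupdict())

# A match object has no .match attribute, so calling it again fails.
print(hasattr(m, "match"))   # False
```

This is why the line in the question that calls match a second time on the match object cannot work.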

It's also good practice to test patterns on a regex helper site; there are several of them, regex101 for example.

Edit: OK, changed it up a little.

Since the first character MUST be alphabetic, [a-zA-Z] checks for it. It is followed by [a-zA-Z0-9_.+-]+, where + means 1 or more (you can change it to * if you want 1-letter usernames) alphanumeric characters plus the few special characters you gave in the original post.
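The difference between + and * can be checked in isolation. The sketch below takes only the user-name part of the pattern and anchors it at both ends; the names one_or_more and zero_or_more are made up for this comparison:

```python
import re

# User-name part only, anchored at both ends for the comparison.
one_or_more = re.compile(r"^[a-zA-Z][a-zA-Z0-9_.+-]+$")   # needs at least 2 characters
zero_or_more = re.compile(r"^[a-zA-Z][a-zA-Z0-9_.+-]*$")  # 1 character is enough

for name in ["b", "bob", "9lives"]:
    print(name, bool(one_or_more.match(name)), bool(zero_or_more.match(name)))
# b False True
# bob True True
# 9lives False False
```

With +, a one-letter user name such as "b" is rejected because the leading letter must be followed by at least one more character. Both variants reject "9lives" because the first character is not alphabetic.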

After @, [a-zA-Z0-9-._]+ means 1 or more of the characters in this bracket, followed by [a-zA-Z], which forces the address to end (at $) with an alphabetic character.
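A quick way to see the domain rule in action is to run the pattern against a few addresses. The helper domain_of below is a hypothetical name used only for this sketch. It also shows one caveat that may matter for the "no whitespace" requirement: in Python, $ also matches just before a trailing newline, so re.fullmatch (or \Z in place of $) is the stricter choice.

```python
import re

# Pattern from the answer, unchanged.
ANSWER_PATTERN = r"(?P<user>^[a-zA-Z][a-zA-Z0-9_.+-]+)@(?P<domain>[a-zA-Z0-9-._]+[a-zA-Z])$"
answer_re = re.compile(ANSWER_PATTERN)

def domain_of(address):
    """Return the domain if the address matches the answer's pattern, else None."""
    m = answer_re.match(address)
    return m.group("domain") if m else None

print(domain_of("bob@aus.space.com"))    # aus.space.com (ends with a letter)
print(domain_of("bob@aus.space.c0"))     # None (ends with a digit)
print(domain_of("bob@aus.space.com."))   # None (ends with a dot)

# '$' also matches just before a trailing newline, so this still matches:
print(domain_of("bob@aus.space.com\n"))  # aus.space.com

# fullmatch requires the whole string to be consumed, so the newline is rejected:
print(answer_re.fullmatch("bob@aus.space.com\n"))  # None
```

If trailing newlines must be rejected, swapping match for fullmatch inside find_email would be one option; whether that is needed depends on where the input comes from.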

If you still have some addresses that won't match, check some regex pages; with a little tinkering it will work.
