
Whole words in Python regular expressions

How do I find whole words using regular expressions in Python? I use Beautiful Soup and the re library to parse a document. In the soup, I need to find all content that follows the word 'E-mail'. I tried:

for sublink in link.findAll(text = re.compile("[E-mail:0-9a-zA-Z]")):
         print sublink.encode('utf-8') 

But it does not work.

Here is a working example for word extraction via regular expressions:

import re

text = "First line\n" + \
    "Second line\n" + \
    "Important line! E-mail:mail@domain.de, Phone:991\n" + \
    "Another important line! E-mail:tom@gmail.com, Phone:001\n" + \
    "Another line"
print(text)

# The literal prefix "E-mail:" is followed by a capture group for the address.
emails = re.findall(r"E-mail:([\w@.-]+)", text)
print("Found email(s): " + ', '.join(emails))

Output (after the echoed text):

Found email(s): mail@domain.de, tom@gmail.com

Not sure if that's what you are looking for.

Edit: The characters 0-9a-zA-Z can be written as \w (which also matches the underscore). And yes, I added . and - as well. If other characters can occur, simply add them to the character class [\w@.-].
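To illustrate why the pattern from the question probably did not behave as expected, here is a small sketch comparing it with the literal-prefix pattern used above. The sample line is taken from the example text; the variable names are only for illustration.

```python
import re

line = "Important line! E-mail:mail@domain.de, Phone:991"

# Square brackets define a character class: the pattern matches exactly ONE
# character from the set. Inside the brackets, "E-m" is even read as a range.
char_class = re.compile(r"[E-mail:0-9a-zA-Z]")
first_hit = char_class.search(line).group(0)

# Outside brackets, "E-mail:" is matched literally, and the parentheses
# capture only the address that follows it.
literal_prefix = re.compile(r"E-mail:([\w@.-]+)")
address = literal_prefix.search(line).group(1)

# The character class also matches a line that contains no e-mail at all.
matches_unrelated = char_class.search("Second line") is not None

print(first_hit)
print(address)
print(matches_unrelated)
```

So the bracketed version matches nearly every non-empty string, while the literal prefix with a capture group picks out just the address.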


 