
Replace two or more characters in a string using a single pattern with re.sub in Python regular expressions

I want to clean up obfuscated email addresses using a single regex pattern: replace "At" or "at" with "@", and replace "dot" with ".".

Code:

import re

email = "abc at xyz.com, abc At xyz.com, abc (at) xyz [dot] com"
pa = re.compile(r'(\s+[\(\[]*\s*at*\s*[\)\]]*\s+)',flags=re.IGNORECASE)
em = pa.sub(r'@',email)
print(em)

Output

abc@xyz.com, abc@xyz.com, abc@xyz [dot] com

Expected output

abc@xyz.com, abc@xyz.com, abc@xyz.com

How can I replace '[dot]' with '.'?

Requiring the substitution to happen with a single pattern just moves the problem somewhere else. In brief, the second argument to re.sub can be a function of arbitrary complexity, so a single pattern is enough; insisting that the function also fit on a single line seems somewhat artificial.

Here, we call re.sub with a function that uses a simple dictionary to decide what to replace each match with.

import re

email = "abc at xyz.com, abc At xyz.com, abc (at) xyz [dot] com"
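# \W* swallows the surrounding spaces and brackets; group 1 captures only "at" or "dot"
pa = re.compile(r'\W*(at|dot)\W*', flags=re.IGNORECASE)
# Lowercase the captured word and look up its replacement in the dictionary
em = pa.sub(lambda m: {'dot': '.', 'at': '@'}[m.group(1).lower()], email)
print(em)  # abc@xyz.com, abc@xyz.com, abc@xyz.com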

The main trick is to capture just the dictionary key in the parenthesized subexpression, which is then available as m.group(1). Calling .lower() on it lets "At" and "at" share the same dictionary entry.
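One caveat with \W*(at|dot)\W* is that it also matches "at" or "dot" inside longer words, so "data" would turn into "d@a". The following is a minimal sketch of a stricter variant, not part of the original answer: it assumes the only decorations are optional round or square brackets plus whitespace, and it adds \b word boundaries. The names REPLACEMENTS, PATTERN and deobfuscate are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical names for illustration; the lookup table is the same idea as in the answer.
REPLACEMENTS = {'at': '@', 'dot': '.'}

# Assumption: decorations are optional ( ) or [ ] plus whitespace.
# \b on both sides keeps "at" inside words such as "data" from matching.
PATTERN = re.compile(r'\s*[\(\[]?\s*\b(at|dot)\b\s*[\)\]]?\s*', flags=re.IGNORECASE)

def deobfuscate(text):
    """Replace standalone 'at'/'dot' tokens with '@' and '.'."""
    # group(1) holds only the keyword; lowercase it to get the dictionary key
    return PATTERN.sub(lambda m: REPLACEMENTS[m.group(1).lower()], text)

print(deobfuscate("abc at xyz.com, abc At xyz.com, abc (at) xyz [dot] com"))
# abc@xyz.com, abc@xyz.com, abc@xyz.com
print(deobfuscate("data at xyz dot com"))
# data@xyz.com
```

This still rewrites a standalone "at" or "dot" in ordinary prose (for example "meet at noon"), so it is only suitable for text known to contain obfuscated addresses.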

Demo: https://ideone.com/3Llu0i

