
Python re.sub unexpectedly replaces at the start and end of the string

I have the string a = "a:b/c\\" and I want to replace each of : / \\ with _ in a single call.

This is my code

b = re.sub(r'[:/\\]*', '_', a)

However, the result is '_a__b__c__', whereas I expected a_b_c_. This approach also inserts an underscore at the start and end of the string and doubles the ones in between. How can I change this?

import re

a = "a:b/c\\"  # the literal "\\" is a single backslash in the string
b = re.sub(r'[:/\\]*', '_', a)
print(b)  # _a__b__c__ on Python 3.7+

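You're using a character class [], which matches any single character from within that class. Two things are worth noting in your particular scenario:

  1. The \\ in the pattern looks like a two-character sequence, but inside a raw string it is the regex escape for one literal backslash, and the Python literal "a:b/c\\" likewise contains a single backslash. The character class handles that correctly, so this part is not the cause.
  2. You've quantified the class with a * , which means "zero or more matches". The class is now effectively optional, so the pattern also matches the empty string at every position where none of those characters occurs, and each of those empty matches is replaced with _ as well.

The essential fix is to eliminate the misused * quantifier. The version below also swaps the character class for a group of alternatives, which matches the same three characters here: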

import re
a = "a:b/c\\"
b = re.sub(r'(:|/|\\)', '_', a)
print(b) # 'a_b_c_'
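
To see where the extra underscores come from, here is a minimal sketch that lists every match of the original pattern. It assumes Python 3.7 or later, where re.sub and re.finditer treat empty matches the same way; the variable names matches and fixed are only illustrative.

```python
import re

a = "a:b/c\\"  # six characters: a : b / c \

# Collect every match of the original pattern as (start, end, text).
matches = [(m.start(), m.end(), m.group())
           for m in re.finditer(r'[:/\\]*', a)]
for start, end, text in matches:
    print(start, end, repr(text))

# Empty matches are the ones that produce the unexpected underscores.
empty = [m for m in matches if m[2] == '']
print(len(empty))  # 4 on Python 3.7+: before a, b, c and at the end

# The character class with no quantifier only matches real separators.
fixed = re.sub(r'[:/\\]', '_', a)
print(fixed)  # a_b_c_
```

Each empty match is replaced with _ just like a real one, which accounts for the leading underscore, the doubled ones in the middle, and the extra one at the end.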

Regex101: the pattern there differs slightly because the tool does not use Python's raw r'' string syntax, which is what removes the need to double-escape the backslash \ characters at the string level. Regardless, it illustrates what is fundamentally happening here.

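I changed re.sub(r'[:|/|\\]*', '_', a) to re.sub(r'[:|/|\\]+', '_', a) and the problem was solved. + means the characters must occur one or more times, so the pattern can no longer match the empty string. (Note that | is a literal character inside [], so this class also matches |; [:/\\]+ is enough.)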
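
The + quantifier and the unquantified pattern give the same result on the string from the question, but they differ when separators are adjacent. The sketch below uses a hypothetical input s (not from the question) to show the difference; the names one_each and collapsed are illustrative only.

```python
import re

a = "a:b/c\\"      # the string from the question
s = "a:/b\\\\c"    # hypothetical input with adjacent separators: a:/b\\c

# On the original string both patterns agree.
print(re.sub(r'[:/\\]', '_', a))   # a_b_c_
print(re.sub(r'[:/\\]+', '_', a))  # a_b_c_

# With adjacent separators they diverge.
one_each = re.sub(r'[:/\\]', '_', s)    # one underscore per separator
collapsed = re.sub(r'[:/\\]+', '_', s)  # one underscore per run of separators
print(one_each)   # a__b__c
print(collapsed)  # a_b_c
```

Which one to use depends on whether a run of separators should become a single underscore or one underscore per character.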
