
Why does re.sub match \ when r-string (r'\\..') is used in re.sub criteria?

txt=r"\\xa3100.00."
print(txt)

Output: \\xa3100.00.

txt="\\xa3100.00."
print(txt)

Output: \xa3100.00.

In the following example, the txt literal contains \\, which is effectively a single \, because \ is an escape character in a string literal written without the r prefix.

So why does re.sub replace that single \ when I have used an r-string as the re.sub search pattern (an r-string means the value contains \\ and not \)? In other words, why does re.sub match a single \ when an r-string (r'\\..') is used as the pattern?

import re
txt="\\xa3100.00."
re.sub(r"\\xa3", r"£", txt)

Output: £100.00.

That happens because \ also has a special meaning in regular expressions: placed in front of a character that is special in regular expression syntax (including \ itself), it means "consider the next character 'as is'".

Your pattern indeed begins with

 \\xa3

but that means

  • a literal \
  • x
  • a
  • 3

In other words, r"\\xa3" is a string with content \\xa3 , while "\\xa3" is a string with content \xa3 ; however, when the string content \\xa3 is used as a regular expression, it is the PATTERN "a literal \ followed by xa3", which matches the text \xa3 .
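A minimal sketch that makes the two layers visible by checking the length of each string and then running the substitution from the question. The variable names are only illustrative.

```python
import re

raw_pattern = r"\\xa3"        # string content: \\xa3  (two backslashes)
plain_text = "\\xa3100.00."   # string content: \xa3100.00.  (one backslash)

# Layer 1: the Python string literal.
print(len(raw_pattern))   # 5 characters: \ \ x a 3
print(len("\\xa3"))       # 4 characters: \ x a 3

# Layer 2: the regex engine reads the content \\xa3 as
# "one literal backslash, then x, a, 3".
result = re.sub(raw_pattern, "£", plain_text)
print(result)             # £100.00.
```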

The backslash \ is used as an escape BOTH for the quoted string and for the regular expression.

To match two literal backslashes in the text (as in the raw string r"\\xa3100.00."), you would need to use r"\\\\xa3", for example.
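As a rough illustration, the sketch below runs both patterns against the raw-string version of the text (which really contains two backslashes). It also uses re.escape, a standard-library helper not mentioned in the answer, as one way to build such a pattern without counting backslashes by hand.

```python
import re

raw_txt = r"\\xa3100.00."   # string content: \\xa3100.00.  (two backslashes)

# r"\\xa3" matches ONE backslash followed by xa3, so the first
# backslash of the text is left untouched.
one = re.sub(r"\\xa3", "£", raw_txt)
print(one)    # \£100.00.

# r"\\\\xa3" matches TWO backslashes followed by xa3.
two = re.sub(r"\\\\xa3", "£", raw_txt)
print(two)    # £100.00.

# re.escape builds the same pattern from the literal text to find.
escaped = re.escape(r"\\xa3")
three = re.sub(escaped, "£", raw_txt)
print(three)  # £100.00.
```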
