Python regular expression: understanding the word boundary \b
Input:
- "example (.com)"
Output:
- "example"
What I tried:
import re
pattern=re.compile(r'\b\([\W\w]+\)\b')
#pattern=re.compile(r'\([\W\w]+\)')
print(pattern.sub("","example (.com)"))
This doesn't work, but if I remove \b it works fine. Why?
Be clear about what \b stands for:
(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))
That is, either a word character ahead and not behind, or the other way round. For ASCII text, a word character (\w) stands for [A-Za-z0-9_] (Python 3 str patterns also accept Unicode letters and digits unless re.ASCII is set). It does not include ( or ), and a space is not a word character either, so there's no word boundary between a space and a parenthesis.
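To make this concrete, here is a small sketch that lists every position in the question's input where \b matches, and compares it with the lookaround expression above. The variable names are only for illustration.

```python
import re

text = "example (.com)"

# Positions where the built-in \b assertion succeeds.
builtin = [m.start() for m in re.finditer(r'\b', text)]

# Positions where the equivalent lookaround expression succeeds.
lookaround = [
    m.start()
    for m in re.finditer(r'(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))', text)
]

print(builtin)                # [0, 7, 10, 13]
print(builtin == lookaround)  # True
```

Index 8 (just before the opening parenthesis) and index 14 (just after the closing one) are not in the list, which is why a pattern that demands \b at those two spots cannot match.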
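There is no word boundary immediately before (, because the space before it and ( itself are both non-word characters, so \b won't match there. Instead, you could use \B, which matches exactly where \b does not, such as between the blank and (.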
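A quick comparison of the patterns on the question's input, as a sketch. The first two patterns come from the question and this answer; the third one, which also consumes the whitespace before the parenthesis, is my own assumption about how to get the exact desired output and is not part of either answer.

```python
import re

text = "example (.com)"

with_b = re.compile(r'\b\([\W\w]+\)\b')     # original pattern from the question
with_B = re.compile(r'\B\([\W\w]+\)\B')     # same pattern with \b swapped for \B
with_space = re.compile(r'\s*\([\W\w]+\)')  # assumed variant: also eats leading whitespace

r1 = with_b.sub("", text)      # no match, text is returned unchanged
r2 = with_B.sub("", text)      # parenthesized part removed, trailing space remains
r3 = with_space.sub("", text)  # parenthesized part and the space before it removed

print(repr(r1))  # 'example (.com)'
print(repr(r2))  # 'example '
print(repr(r3))  # 'example'
```

Note that both \B and the version with no boundary assertion leave the space after "example" in place, so if the output must be exactly "example", either include the whitespace in the pattern or call .strip() on the result.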