
Python regular expression: understanding the boundary "\b"

Input:

- "example (.com)"

Output:

- "example"

What I tried:

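import re

# With \b on both sides, nothing is replaced and the input is printed unchanged.
pattern = re.compile(r'\b\([\W\w]+\)\b')
# Without \b, the parenthesised part is removed.
# pattern = re.compile(r'\([\W\w]+\)')
print(pattern.sub("", "example (.com)"))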

This doesn't work, but if I remove the \b it works fine. Why?

Be clear about what \b stands for:

(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))

That is, a word character lies either ahead but not behind, or behind but not ahead. A word character (\w) is [A-Za-z0-9_] for ASCII text (in Python 3, str patterns also count Unicode letters and digits unless re.ASCII is set) and includes neither ( nor ), so there is no word boundary between the space and the opening parenthesis, nor between the closing parenthesis and the end of the string.
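To see this concretely, here is a small sketch that lists every position where \b matches in the question's input and compares it with the lookaround form quoted above. The variable names are only illustrative.

```python
import re

text = "example (.com)"

# Zero-width matches: each match's start is a word-boundary position.
boundary = [m.start() for m in re.finditer(r"\b", text)]

# The same positions, found with the explicit lookaround form.
lookaround = [
    m.start()
    for m in re.finditer(r"(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))", text)
]

print(boundary)    # [0, 7, 10, 13]
print(lookaround)  # [0, 7, 10, 13]

# Index 8 (between " " and "(") and index 14 (after ")") are missing,
# so the pattern from the question cannot match anywhere.
unchanged = re.sub(r"\b\([\W\w]+\)\b", "", text)
print(unchanged == text)  # True
```

The boundaries sit around "example" and "com" only, which is why anchoring the parentheses with \b makes the substitution a no-op.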

( is not a word character, so \b won't match between the blank and (. Instead, you could use \B, which matches wherever \b does not, including the position between the blank and (.
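A minimal sketch of that suggestion, using the input from the question. The second pattern is not from the original posts; it is an assumed variant that drops the anchors and also consumes the whitespace before the parenthesis, so that the result is exactly "example".

```python
import re

text = "example (.com)"

# \B holds between " " and "(" and between ")" and the end of the string,
# so the parenthesised part is removed; the space before it is kept.
with_non_boundary = re.sub(r"\B\([\W\w]+\)\B", "", text)
print(repr(with_non_boundary))  # 'example '

# Assumed variant: no anchors, and \s* swallows the preceding whitespace.
without_anchor = re.sub(r"\s*\([^)]*\)", "", text)
print(repr(without_anchor))  # 'example'
```

Note that the \B version still leaves the trailing space, so it needs a .strip() or a pattern like the second one to reach the desired output.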
