Regex to match everything except pattern
I am coming from this question. Now what I want is the exact opposite. I want to match all characters except this pattern:
yearid="[0-9]+"
How do I do that, please? I have tried
(?!yearid="[0-9]+")
but it refuses to match.
There are actually two ways to do this. You can use [^0-9]+ where the ^ negates the character class inside the brackets, or \D+ where \D is any non-digit character.
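A minimal sketch of how the two spellings compare, using Python's re module. The sample string is made up for illustration. Both patterns match runs of non-digit characters, which is narrower than "everything except yearid="..."", so this only shows what the answer above describes. On ASCII input the two behave the same; with Python 3 str patterns, \D also excludes non-ASCII Unicode digits, which [^0-9] does not.

```python
import re

# Hypothetical sample input, not taken from the question.
sample = 'abc123def45'

# [^0-9]+ is a negated character class: one or more characters that are not 0-9.
by_class = re.findall(r'[^0-9]+', sample)

# \D+ is the shorthand: one or more non-digit characters.
by_shorthand = re.findall(r'\D+', sample)

print(by_class)
print(by_shorthand)
```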
re.sub(r'yearid="[0-9]+"', '', string_to_fix)
Match the pattern like normal, then substitute nothing for it, and you get back the rest of the string.
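As a quick illustration of that substitution, here is a sketch run on a made-up input. The value of string_to_fix is an assumption, since the question does not show the real data.

```python
import re

# Hypothetical input; the real data is not shown in the question.
string_to_fix = '<row yearid="2011" name="foo"/>'

# Replace every occurrence of the pattern with an empty string.
cleaned = re.sub(r'yearid="[0-9]+"', '', string_to_fix)

print(cleaned)
```

Note that the whitespace around the removed attribute is left untouched, so the result may contain a doubled space.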
Or, if you want to go the hard way and negate it:
re.sub(r'(.*?)(?:yearid="[0-9]+")(.*)', '\\1\\2', string_to_fix)
This first matches everything lazily with (.*?) until it finds the yearid="XXXX", matches that as a noncapturing group (?:yearid="[0-9]+"), then matches everything else with (.*). Finally, it replaces the original full string with just the 1st and 2nd capture groups, essentially cutting out the section you want.
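A small sketch of how the capture-group version behaves when the pattern occurs more than once. The input string and the PATTERN name are assumptions for illustration. Because the trailing (.*) is greedy, a single match swallows the rest of the line, so only the first occurrence is cut out. The re.split call is shown as one alternative way to get the text that is not the pattern; it is not part of the answer above.

```python
import re

# Hypothetical name for the pattern from the question.
PATTERN = r'yearid="[0-9]+"'

# Hypothetical input with two occurrences of the pattern.
string_to_fix = 'a yearid="1" b yearid="2" c'

# Capture-group version; r'\1\2' is the same replacement as '\\1\\2'.
# The greedy (.*) consumes the rest of the line, so only the first
# occurrence is removed.
first_removed = re.sub(r'(.*?)(?:' + PATTERN + r')(.*)', r'\1\2', string_to_fix)

# re.split returns the pieces of text between the matches.
pieces = re.split(PATTERN, string_to_fix)

print(first_removed)
print(pieces)
```

Also keep in mind that . does not match newlines unless re.DOTALL is passed, so on multi-line input the capture-group version works line by line.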