
Regex to match everything except pattern

I am coming from this question. Now what I want is the exact opposite: I want to match all characters except this pattern:

yearid="[0-9]+"

How do I do that, please?

I have tried (?!yearid="[0-9]+"), but it refuses to match.
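A likely reason for that behaviour: a negative lookahead is a zero-width assertion, so on its own it consumes no characters and can only produce empty matches at positions where the pattern does not start. A minimal Python sketch, assuming a made-up sample string that is not from the original question:

```python
import re

# Hypothetical sample input, invented for illustration.
sample = 'x yearid="1"'

# The lookahead alone consumes nothing, so each match is an empty string.
matches = re.findall(r'(?!yearid="[0-9]+")', sample)

# Every match is empty; only the position where yearid="1" starts is skipped.
print(all(m == '' for m in matches))
print(len(matches))
```

So the pattern does match, just never any actual text.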

There are actually two ways to do this. You can use [^0-9]+, where the ^ negates the set inside the brackets, or \D+, where \D is any non-digit character.
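As a quick illustration of those two forms, here is a small Python sketch. The sample text and variable names are assumptions made for the example. Note that both patterns negate digits only, not the yearid="..." pattern as a whole.

```python
import re

# Hypothetical sample input, invented for illustration.
text = 'abc123def'

# Negated character class: runs of anything except the ASCII digits 0-9.
by_class = re.findall(r'[^0-9]+', text)

# Shorthand class: runs of non-digit characters.
by_shorthand = re.findall(r'\D+', text)

print(by_class)      # ['abc', 'def']
print(by_shorthand)  # ['abc', 'def']
```

For plain ASCII input the two behave the same. With Python 3 str patterns, \D also excludes non-ASCII decimal digits, so the results can differ on such input.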

re.sub(r'yearid="[0-9]+"', '', string_to_fix)

Match the pattern as normal, substitute an empty string for it, and you get back the complete string without it.
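A runnable sketch of that substitution, assuming a made-up input string (string_to_fix here is only a stand-in value). It also shows re.split, which returns the pieces around the pattern, in case "everything except the pattern" is wanted as separate parts:

```python
import re

# Hypothetical input, invented for illustration.
string_to_fix = '<movie yearid="1999" title="x">'

# Remove every occurrence of the pattern by replacing it with nothing.
cleaned = re.sub(r'yearid="[0-9]+"', '', string_to_fix)

# Alternatively, split on the pattern to get the surrounding pieces.
pieces = re.split(r'yearid="[0-9]+"', string_to_fix)

print(cleaned)  # <movie  title="x">
print(pieces)   # ['<movie ', ' title="x">']
```

The space that preceded the attribute is left behind, so the cleaned string contains two consecutive spaces.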

Or, if you want to go the hard way and negate it:

re.sub(r'(.*?)(?:yearid="[0-9]+")(.*)', r'\1\2', string_to_fix)

This first matches everything lazily with (.*?) until it finds yearid="XXXX", matches that as a non-capturing group (?:yearid="[0-9]+"), then matches everything else with (.*). Finally, it replaces the whole match with just the 1st and 2nd capture groups, essentially cutting the yearid section out of the string.
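One difference between the two approaches is worth seeing in a sketch. With an input that contains the pattern twice (a made-up string, not from the question), the capture-group version removes only the first occurrence, because the trailing (.*) swallows the rest of the line:

```python
import re

# Hypothetical input with two occurrences, invented for illustration.
string_to_fix = 'a yearid="1" b yearid="2" c'

# Capture-group version: keeps groups 1 and 2 and drops what sits between them.
hard_way = re.sub(r'(.*?)(?:yearid="[0-9]+")(.*)', r'\1\2', string_to_fix)

# Plain substitution: removes every occurrence.
simple_way = re.sub(r'yearid="[0-9]+"', '', string_to_fix)

print(hard_way)    # a  b yearid="2" c
print(simple_way)  # a  b  c
```

Since . does not match newlines by default, the capture-group version would also need re.DOTALL to behave as described on multi-line input.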
