
Regex that excludes all words except some

I thought that filtering a string like "Hello <strong>plip</strong> plop" to obtain "plip plop", that is, excluding all words except 'plip' and 'plop', would be easy with this C# line: new Regex("[^(plip)(plop)]").Replace(inputString,""). Unfortunately, the negated brackets [^] do not seem to accept whole words to exclude: they keep every letter contained in 'plip' and 'plop' (the result is "llooplipoplop").

Is there a way to achieve this in a single regex/line, or is it necessary to loop over all matches of plip and plop, then concatenate them?
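For reference, the behavior described above can be reproduced with a short sketch. Inside [...] the parentheses are literal characters and the class matches one character at a time, so [^(plip)(plop)] means "any character other than (, ), p, l, i, o". The class name CharClassDemo and the method RemoveOthers are hypothetical names used only for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper reproducing the attempt from the question.
public static class CharClassDemo
{
    public static string RemoveOthers(string input)
    {
        // A negated character class: removes every single character
        // that is not one of '(', ')', 'p', 'l', 'i', 'o'.
        return new Regex("[^(plip)(plop)]").Replace(input, "");
    }
}
```

Calling CharClassDemo.RemoveOthers("Hello <strong>plip</strong> plop") returns "llooplipoplop", because the l and o characters of "Hello" and "strong" survive along with the two target words.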

Hope this works:

(?<=(\bplip\b|\bplop\b|^)).*?(?=(\bplip\b|\bplop\b|$))

You should set the singleline mode (RegexOptions.Singleline in .NET) for the above regex to work.

Works here.
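A minimal sketch of how the pattern above might be applied in C#, assuming the intent is to replace everything it matches with an empty string. The class name LookaroundFilter and the method Strip are hypothetical; only Regex.Replace and RegexOptions.Singleline come from .NET.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the lookaround pattern from this answer.
public static class LookaroundFilter
{
    // Matches the text between the start of the string or a kept word
    // and the next kept word or the end of the string.
    private const string Pattern =
        @"(?<=(\bplip\b|\bplop\b|^)).*?(?=(\bplip\b|\bplop\b|$))";

    public static string Strip(string input)
    {
        // Singleline lets '.' also match newlines, so multi-line input is covered.
        return Regex.Replace(input, Pattern, "", RegexOptions.Singleline);
    }
}
```

With the input from the question, LookaroundFilter.Strip("Hello <strong>plip</strong> plop") gives "plipplop": the separating space is removed along with everything else that is not one of the two words.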

Generally speaking, it is much easier to write a regex that matches what you do want than one that matches all the stuff you don't want.

In this case you want to "exclude all words except plip and plop", but why not just include only plip and plop instead?

using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "Hello <strong>plip</strong> plop";
var matches = Regex.Matches(input, "plip|plop");
var result = string.Join("", matches.Cast<Match>().Select(x => x.Value));

Console.Out.WriteLine(result); // prints "plipplop"

Of course, since you asked for a one-liner, you could do everything without the temp variables (and good luck to the next guy reading the code!):

var result = string.Join("", Regex.Matches("Hello <strong>plip</strong> plop", "plip|plop").Cast<Match>().Select(x => x.Value));

Also, assuming your actual word list is more complicated than plip and plop, you can do something like var pattern = string.Join("|", words); to construct the pattern.
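The pattern-building idea could be sketched as follows. This goes slightly beyond the answer and makes a few assumptions: each word is passed through Regex.Escape in case it contains regex metacharacters, the alternation is wrapped in \b so that only whole words match (which only makes sense if every word starts and ends with a letter, digit, or underscore), and the matches are joined with a space to get the "plip plop" form the question asked for. The names WordFilter and KeepOnly are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: keeps only the listed words, in order of appearance.
public static class WordFilter
{
    public static string KeepOnly(string input, IEnumerable<string> words, string separator = " ")
    {
        // Escape each word, then build \b(?:word1|word2|...)\b.
        var pattern = @"\b(?:" + string.Join("|", words.Select(Regex.Escape)) + @")\b";

        // Collect the matched words and join them with the separator.
        var kept = Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value);
        return string.Join(separator, kept);
    }
}
```

For example, WordFilter.KeepOnly("Hello <strong>plip</strong> plop", new[] { "plip", "plop" }) returns "plip plop"; passing "" as the separator reproduces the "plipplop" output of the snippets above.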
