Regex to remove characters and supplied words
I need a regex that will allow only alphanumeric characters AND also remove certain full-words.
Example:
Input string: this-is-johny-bravo's-grand-dad
Result string: johny-bravos-dad
Words/characters to replace by an empty string: this,is,',grand
Here is what I have so far:
var input = "this-is-johny-bravo's-grand-dad";
var regex = new Regex(@"([^a-z0-9\-][\b(this|is|grand)\b]?)");
var result = regex.Replace(input, "");
The result seems to not have the apostrophe but unfortunately still includes the rejected full-words.
You also need to add the character class to the alternation:
new Regex(@"\b(this|is|grand)\b-?|[^a-z0-9-]");
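For reference, here is a minimal sketch of that pattern applied to the input from the question. The names `SlugCleaner` and `Clean` are made up for illustration; only `Regex` and `Regex.Replace` come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugCleaner
{
    // Either a rejected whole word (plus the dash that follows it, if any),
    // or any single character that is not a lowercase letter, digit, or dash.
    private static readonly Regex Unwanted =
        new Regex(@"\b(this|is|grand)\b-?|[^a-z0-9-]");

    // Replaces every match with the empty string.
    public static string Clean(string input) => Unwanted.Replace(input, "");
}
```

Calling `SlugCleaner.Clean("this-is-johny-bravo's-grand-dad")` should give `johny-bravos-dad`. The word boundaries keep `is` from matching inside longer words such as `history`.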
Your expression is too complicated. Try
\b(this|is|grand|')\b-?
Also, and that is the root cause of your problem: character classes are not for alternation. This
[\b(this|is|grand)\b]
is equivalent to this
[()adghinrst|]
plus a backspace character, because inside a character class .NET reads \b as a backspace rather than a word boundary. Either way it matches a single character from that set, never a whole word.
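To see the difference concretely, here is a small sketch comparing the bracketed version with a real group. The class and field names are hypothetical, and both patterns are anchored with `^` and `$` only so that each test string has to match in full.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Inside [...], the letters, parentheses and | are literal set members,
    // and \b is a backspace, so this matches exactly ONE character.
    public static readonly Regex AsClass =
        new Regex(@"^[\b(this|is|grand)\b]$");

    // Outside a class, | alternates between whole words and \b is a word boundary.
    public static readonly Regex AsGroup =
        new Regex(@"^\b(this|is|grand)\b$");
}
```

With these definitions, `AsClass` accepts single characters such as `g` or `|` but rejects `grand`, while `AsGroup` accepts `grand` and rejects `g`.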
Thinking about it, you probably want this:
(\b(this|is|grand)\b|[^a-z0-9-])-?
Break-down:
(                       # group 1
  \b(this|is|grand)\b   #   any of these words
  |                     #   or
  [^a-z0-9-]            #   any character except one of these
)                       # end group 1
-?                      # optional dash at the end
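The commented form can also be used as-is in C#. The sketch below assumes `RegexOptions.IgnorePatternWhitespace`, which makes .NET ignore unescaped whitespace in the pattern and treat `#` as the start of a comment. `CommentedPattern` and `Clean` are illustrative names, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentedPattern
{
    // IgnorePatternWhitespace lets the pattern span several lines
    // and carry # comments, mirroring the break-down above.
    private static readonly Regex Unwanted = new Regex(@"
        (                       # group 1
          \b(this|is|grand)\b   #   any of these words
          |                     #   or
          [^a-z0-9-]            #   any character except one of these
        )                       # end group 1
        -?                      # optional dash at the end
        ", RegexOptions.IgnorePatternWhitespace);

    // Replaces every match with the empty string.
    public static string Clean(string input) => Unwanted.Replace(input, "");
}
```

`CommentedPattern.Clean("this-is-johny-bravo's-grand-dad")` should return `johny-bravos-dad`. One thing to keep in mind with this variant: the optional dash sits outside the group, so a dash that directly follows a removed character is removed too (for example `x!-y` becomes `xy`).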