
Need a regex to replace all symbols surrounded by letters or numbers only

I need a regex that replaces, with a space, every symbol that is surrounded by letters or numbers only. I'll be using C# to run the expression and I'm fine with that part; I'm just stuck on the regex itself.

So after the replacement:

  1. Type-01 would be Type 01
  2. 01 )* would still be 01 )*
  3. -Category:Toys would still be -Category:Toys
  4. White:Black would be White Black

Current expression:

(?<=\w)[^a-zA-Z0-9Category:]+(?=\w)

Input string:

-Category:Toys AND (Teddy Bear Type-01*) OR (Teddy Bear White:Black)

Required output:

-Category:Toys AND (Teddy Bear Type 01*) OR (Teddy Bear White Black)

But what I'm getting is:

-Category:Toys AND Teddy Bear Type 01 OR Teddy Bear White:Black)

Not sure if I'm just missing something simple or have got the wrong end of the stick.

You can't put words into a character class. Every character listed there is added to the class individually, and the order doesn't matter, so [^a-zA-Z0-9Category:] simply excludes letters, digits and the colon.
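To make that concrete, here is a minimal sketch (the sample string and variable names are made up for illustration) showing that the class from the question and a version with the same characters shuffled behave identically:

```csharp
using System;
using System.Text.RegularExpressions;

// Both classes contain exactly the same set of characters;
// "Category" inside the brackets is just the letters C, a, t, e, g, o, r, y.
var asWritten = new Regex(@"[^a-zA-Z0-9Category:]");
var reordered = new Regex(@"[^:yrogetaC0-9a-zA-Z]");

// Hypothetical sample input, not taken from the question.
string sample = "Category:Toys-01";

// The hyphen is the only character outside the set, in both cases.
Console.WriteLine(asWritten.Replace(sample, " ")); // Category:Toys 01
Console.WriteLine(reordered.Replace(sample, " ")); // Category:Toys 01
```

The word "Category" adds nothing here, because all of its letters are already covered by a-zA-Z.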

I am not sure whether this is sufficient for you, but for your example the following works, except that it leaves every colon in place, including the one in White:Black (see the update below):

(?<=\w)[^a-zA-Z0-9*:()\s]+(?=\w)

and replace with a single space.

I would also make it more Unicode-aware:

(?<=\w)[^\p{L}0-9*:()\s]+(?=\w)

where \p{L} is the Unicode property for a letter in any language.
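As a rough illustration of the difference, the sketch below runs both patterns over a made-up input containing a non-ASCII letter (the sample string is an assumption, not from the question). In .NET, \w already matches Unicode letters by default, so only the negated class changes the result:

```csharp
using System;
using System.Text.RegularExpressions;

var asciiOnly = new Regex(@"(?<=\w)[^a-zA-Z0-9*:()\s]+(?=\w)");
var unicodeAware = new Regex(@"(?<=\w)[^\p{L}0-9*:()\s]+(?=\w)");

// Hypothetical input: "Bär-01", with the a-umlaut written as an escape
// so the example does not depend on the source file encoding.
string sample = "B\u00E4r-01";

// The ASCII-only class treats the a-umlaut as a symbol and replaces it too.
Console.WriteLine(asciiOnly.Replace(sample, " "));    // B r 01
// With \p{L} the letter is kept and only the hyphen is replaced.
Console.WriteLine(unicodeAware.Replace(sample, " ")); // Bär 01
```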


Update:

If you want to keep the colon only when it is preceded by "Category", you could do it like this:

(?<=\w)(?:[^a-zA-Z0-9*()\s:]+|(?<!Category):)(?=\w)


I added the colon to the negated character class, so the first alternative never replaces a colon. Then I added a second alternative that does replace a colon, but only when it is not preceded by "Category".
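A minimal sketch of running the updated pattern in C# against the input string from the question (the variable names are only illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

string input = "-Category:Toys AND (Teddy Bear Type-01*) OR (Teddy Bear White:Black)";

// First alternative: runs of symbols other than * ( ) : and whitespace.
// Second alternative: a colon that is not preceded by "Category".
// Both must sit between two word characters.
string pattern = @"(?<=\w)(?:[^a-zA-Z0-9*()\s:]+|(?<!Category):)(?=\w)";

string output = Regex.Replace(input, pattern, " ");
Console.WriteLine(output);
// -Category:Toys AND (Teddy Bear Type 01*) OR (Teddy Bear White Black)
```

The leading hyphen survives because it has no word character before it, so the lookbehind (?<=\w) fails there.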

For C#, you can use the Regex.Replace method.

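using System.Text.RegularExpressions;

string a = "Category:Toys AND (Teddy Bear Type-01*) OR (Teddy Bear White/Black)";
// Replace every character that is not a parenthesis, asterisk, colon, ASCII letter or digit with a space.
string s = Regex.Replace(a, @"[^()*:A-Za-z0-9]", " ");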

