Match any special characters (including underscore, but not space) that are not letters
I want to match any special characters that are not numbers or letters (the kind people use to write words). I want to include the underscore, because it is neither a number nor a letter used in words. But I do not want to include the space.
In short, I want to match every line below except the last two.
12345_678
12345*678
12345-678
12345&678
12345-678
12345あ678
12345 678
I could not use [^a-zA-Z0-9] because it does not account for non-Latin letters such as Japanese, so あ would be matched too.
\d+(\W|_)\d+ matched the unwanted space.
What would be the best regular expression for this?
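To make the two failed attempts concrete, here is a minimal sketch in Python (an assumption; the question does not name a language). Python's `re` treats `\w` and `\W` as Unicode-aware for `str` patterns. The names `attempt`, `separator` and `ascii_only` are hypothetical helpers introduced only for illustration.

```python
import re

# Second attempt from the question: digits, one non-word char or underscore, digits.
attempt = re.compile(r"\d+(\W|_)\d+")

# First attempt from the question: anything outside ASCII letters and digits.
ascii_only = re.compile(r"[^a-zA-Z0-9]")

def separator(text):
    """Return the character captured between the digit runs, or None if no match."""
    m = attempt.fullmatch(text)
    return m.group(1) if m else None

print(repr(separator("12345_678")))   # underscore is captured, as wanted
print(repr(separator("12345 678")))   # the space is captured too, which is unwanted
print(ascii_only.search("12345あ678"))  # the Japanese letter is matched, which is unwanted
```

The space slips through because it is a non-word character, so `\W` accepts it just like `*` or `&`.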
Use the following to also ignore Japanese letters:
[^a-zA-Z\d\sぁ-ゟ゠-ヿ一-龯]
The following regex will match any character that is neither an alphanumeric character (including characters of other scripts, such as those used in Japan or Korea) nor a space. This relies on \w being Unicode-aware, as it is in .NET.
([^\w ]|_)
Note the alternation explicitly matching the underscore character, which is necessary since the underscore is part of the \w character class and thus would not be matched by [^\w ] alone. (Also note that the pattern contains a space character after \w.)
If not only plain space characters but also any other white-space characters (such as the tab character) should be excluded from the match, then the following slightly modified pattern might be more appropriate:
([^\w\s]|_)
(An example of the latter pattern in action, including Hiragana and Hangul characters, can be tried on regexstorm.net.)
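As a quick check of the second pattern against the sample lines from the question, here is a small sketch in Python (an assumption; the answer itself was tested on a .NET regex site). Python's `\w` is likewise Unicode-aware for `str` patterns. The names `SPECIAL` and `find_specials` are hypothetical.

```python
import re

# Pattern from the answer: any non-word, non-whitespace character, or an underscore.
SPECIAL = re.compile(r"([^\w\s]|_)")

samples = [
    "12345_678",
    "12345*678",
    "12345-678",
    "12345&678",
    "12345あ678",
    "12345 678",
]

def find_specials(text):
    """Return every character in text that the pattern treats as special."""
    return [m.group(0) for m in SPECIAL.finditer(text)]

for line in samples:
    # Lines containing a letter or a space as separator yield an empty list.
    print(repr(line), find_specials(line))
```

In flavors where `\w` is ASCII-only by default, the Japanese letter would be reported as special, so the result depends on the engine's Unicode handling.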
You may want to look at Unicode Character Categories. It seems that you need to match Symbols and Punctuation.
var regexPattern = @"[\p{S}\p{P}]";
Symbols include +, =, <, $, ^, ¦, §, etc.
Punctuation includes _, -, —, (, {, ", », !, ?, #, *, etc.
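The same category-based idea can be sketched in Python (an assumption; the snippet above is C#). As far as I know, Python's standard `re` module has no `\p{...}` classes, so this sketch checks the general category of each character with `unicodedata` instead. The function names are hypothetical.

```python
import unicodedata

def is_symbol_or_punctuation(ch):
    """True if ch falls in a Unicode Symbol (S*) or Punctuation (P*) category."""
    return unicodedata.category(ch)[0] in ("S", "P")

def find_specials_by_category(text):
    """Return the characters of text that are symbols or punctuation."""
    return [ch for ch in text if is_symbol_or_punctuation(ch)]

for line in ["12345_678", "12345*678", "12345あ678", "12345 678", "1+2=$3"]:
    print(repr(line), find_specials_by_category(line))
```

Letters of any script (category L*) and spaces (category Zs) fall outside both groups, so they are skipped without listing any script ranges by hand.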