Regex to validate that words do not contain numbers or special characters
I am developing a Java app that runs on Android. I am trying to pick all words which do not contain any embedded digits or symbols.
The best I have come up with is:
\b[a-zA-Z]+[a-zA-Z]*+\b
Test Data:
this is a test , an0ther gr8 WW##ee one, w1n 1test test1 end
This results in picking the following:
this, is, a, test, WW##ee, one, end
I need to eliminate the WW##ee from the results.
You shouldn't use the word boundary meta-character \b, since it matches at the position right after WW, where the next character is a hash (#). That position is itself a word boundary. So you should take a different approach:
(?<![\S&&[^,]])[a-zA-Z]+(?![\S&&[^,]])
Using the character class intersection feature of Java's regex, you can define which punctuation characters are allowed to precede or follow a word. The class [\S&&[^,]] matches any non-whitespace character other than a comma, so the lookarounds reject a word that touches such a character. Here the only allowed punctuation is a comma (,).
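To compare the two approaches on the question's test data, here is a minimal sketch that runs the original \b-based pattern and the lookaround pattern above side by side. The class name BoundaryDemo and the helper findAll are made up for illustration, and the sketch targets the standard java.util.regex package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class BoundaryDemo {
    // Original attempt from the question, relying on word boundaries.
    static final Pattern WORD_BOUNDARY =
            Pattern.compile("\\b[a-zA-Z]+[a-zA-Z]*+\\b");

    // Lookarounds with character class intersection: the neighbouring
    // character must be whitespace, a comma, or the start/end of input.
    static final Pattern LOOKAROUND =
            Pattern.compile("(?<![\\S&&[^,]])[a-zA-Z]+(?![\\S&&[^,]])");

    // Collects every match of the pattern in the input, in order.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String data = "this is a test , an0ther gr8 WW##ee one, w1n 1test test1 end";
        System.out.println(findAll(WORD_BOUNDARY, data));
        System.out.println(findAll(LOOKAROUND, data));
    }
}
```

With the \b pattern, the fragments WW and ee are reported as two separate matches because a boundary exists on each side of the ## run. The lookaround pattern skips both fragments.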
You can use a lookbehind and a lookahead to check that there is no # next to the word.
\b(?<!\#)[a-zA-Z]+(?!\#)\b
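As a quick check, the following sketch applies this pattern to the question's test data. The class name HashLookaroundDemo and the helper findAll are hypothetical. Note that the lookarounds guard against # only, so a different symbol between letters is not covered; the second input in main illustrates that limitation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class HashLookaroundDemo {
    // Word boundaries plus lookarounds that forbid a '#' on either side.
    static final Pattern NO_HASH =
            Pattern.compile("\\b(?<!\\#)[a-zA-Z]+(?!\\#)\\b");

    // Collects every match of NO_HASH in the input, in order.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = NO_HASH.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Test data from the question: WW##ee is skipped entirely.
        System.out.println(findAll(
                "this is a test , an0ther gr8 WW##ee one, w1n 1test test1 end"));
        // A symbol other than '#' still splits the token into two matches.
        System.out.println(findAll("WW@@ee"));
    }
}
```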
My solution has evolved a bit as I have gotten additional help with this. This is now my best solution, but it is still a bit lacking: I have not been able to accept "as-is" while rejecting "-this-", or, similarly, to accept "and/or" while rejecting "/slash/". Also, for simplicity, I have made the input data a single word per line.
^(?:[\\p{P}\\p{S}])*?((?:[\\p{L}\\p{Pd}'])+)(?:[\\p{P}\\p{S}])*$
as-is is picked valid
-this- is valid but I wish it weren't
and/or is not valid but I wish it would be picked
/slash/ "slash" is picked valid
(test) "test" is picked valid
[test] "test" is picked valid
<test> "test" is picked valid
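The results above can be reproduced with a small sketch like the one below. The pattern string uses *? after the leading group and * after the trailing group; that reading is an assumption, chosen because it reproduces the listed results. The class name SingleWordDemo and the helper pick are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class SingleWordDemo {
    // Assumed form of the pattern: optional surrounding punctuation or
    // symbols, with letters, dashes and apostrophes captured in group 1.
    static final Pattern WORD = Pattern.compile(
            "^(?:[\\p{P}\\p{S}])*?((?:[\\p{L}\\p{Pd}'])+)(?:[\\p{P}\\p{S}])*$");

    // Returns the captured word, or null when the line is rejected.
    static String pick(String line) {
        Matcher m = WORD.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "as-is", "-this-", "and/or", "/slash/", "(test)", "[test]", "<test>"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + pick(line));
        }
    }
}
```

Because the dash belongs to \p{Pd}, which is part of the captured class, "-this-" is captured whole. The slash in "and/or" is not in the captured class, so that line is rejected outright.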