Java Regex: punctuation, numbers, and letters after numbers

I have text that I want to parse. It contains punctuation, words, letters, etc. So far I have used:

[\d\s\;\:\.\,\)\(]

It seemed to work perfectly until I noticed that it was also picking up the "th" at the end of dates (e.g. 16th February).

How can I modify my current regex to make it work? I tried playing around with ^ (start of string) and $ (end of string) but could not figure it out.

[\\d\\s;:.,)(]

You shouldn't need to escape characters like the colon and semicolon inside a character class, but in Java you do need to double the backslashes in \d and \s, because the pattern is written as a string literal.
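To make the escaping point concrete, here is a minimal sketch of that character class written as a Java string literal. The class name `EscapeDemo`, the method `strip`, and the sample sentence are made up for illustration; the question does not say whether the regex is used for removal or for splitting, so removal is only an assumption here.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Inside a Java string literal the regex \d is written "\\d" and \s is "\\s".
    // The punctuation characters need no escaping inside a character class.
    static final Pattern SEPARATORS = Pattern.compile("[\\d\\s;:.,)(]");

    // Hypothetical helper: removes every character matched by the class.
    static String strip(String text) {
        return SEPARATORS.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Digits, whitespace and the listed punctuation are removed;
        // the letters "th" are not part of the class, so they remain.
        System.out.println(strip("16th February (2012); ok.")); // prints thFebruaryok
    }
}
```

Note that the class only ever matches a single digit, whitespace, or punctuation character, so the ordinal suffix is left behind rather than matched.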

The "th" may be getting picked up because of a bug elsewhere in the code; this character class contains no letters, so on its own it cannot match it.

Also, I can't give better advice because you haven't been very clear about your project. However, if you're selecting punctuation to get rid of it, you might instead try selecting what you want to keep and discarding everything else.
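One way to read the "select what you want to keep" suggestion is sketched below. It assumes the goal is to keep purely alphabetic words and drop everything else, including ordinal suffixes such as the "th" in "16th". The class name `KeepWordsDemo`, the method `keepWords`, and the sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepWordsDemo {
    // \b is a word boundary. Digits and letters are both "word" characters,
    // so there is no boundary between "16" and "th", and "th" is not matched.
    static final Pattern WORD = Pattern.compile("\\b[A-Za-z]+\\b");

    // Hypothetical helper: collects every purely alphabetic word in the text.
    static List<String> keepWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(keepWords("16th February, (see note): done."));
        // prints [February, see, note, done]
    }
}
```

Because the underscore also counts as a word character, a token such as foo_bar would be skipped entirely; whether that is acceptable depends on the text being parsed.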
