Java, Regex, strip unwanted characters [trailing, leading, between]

I need help with a regular expression to strip unwanted characters from a String (in Java). I solved this with four regular expressions applied one after another. The replacement is called many times (peaks: 50+ times/sec), and this decreases performance. I think it should be possible with a single expression, which would improve performance a little.

The test string is

"   ! ... my-Cruc i@l_\\/Disp lay.Na#m3 ?;()!    "

The tasks I would like to perform with the regex:

  • Remove all leading non-alphabetic characters [beginning of string]
  • Remove all trailing non-alphanumeric characters [end of string]
  • Remove all non-alphanumeric characters (except [_-.]) in between

So the result will be

my-Crucil_Display.Nam3
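For reference, the three tasks above can be written as three chained replaceAll calls. This is only a sketch: the four expressions actually used are not shown in the question, and the names ChainedStrip and strip are made up for illustration.

```java
final class ChainedStrip {
    // Hypothetical helper: one replaceAll per task, in the order the tasks are listed.
    static String strip(String s) {
        s = s.replaceAll("^[^\\p{Alpha}]+", "");   // leading non-alphabetic characters
        s = s.replaceAll("[^\\p{Alnum}]+$", "");   // trailing non-alphanumeric characters
        s = s.replaceAll("[^\\p{Alnum}_.-]", "");  // anything left that is not a letter, digit, _ . or -
        return s;
    }

    public static void main(String[] args) {
        String test = "   ! ... my-Cruc i@l_\\/Disp lay.Na#m3 ?;()!    ";
        System.out.println(strip(test)); // prints my-Crucil_Display.Nam3
    }
}
```

Each call compiles and runs its own pattern, which is the overhead a single combined expression would avoid.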

The problem is switching between the built-in classes Alnum and Alpha depending on the position in the string (beginning, end), plus the exception characters [_-.] in between.

I have tried this many times over the last few days, but I cannot get it to work. Removing leading non-alphabetic characters works with the regex

^([^\\p{Alpha}]+)?

But if I append the "between" part, nothing works any longer.

Removing trailing non-alphanumeric characters with the regex

([^\\p{Alnum}]+$) 

works, but not in combination with the other expressions.

One of my last attempts is

(^[^\\p{Alpha}]+)?[^\\p{Alnum}\\._-]+([^\\p{Alnum}]+$)

Can anyone help me get this working?

You may use

^\P{Alpha}+|\P{Alnum}+$|[^\p{Alnum}_.-]

Java:

s = s.replaceAll("^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]", "");
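Because the replacement runs many times per second, it may also be worth compiling the pattern once: String.replaceAll compiles its regex argument on every call. A minimal sketch, assuming a hypothetical helper class DisplayNameCleaner with a clean method (neither name comes from the question):

```java
import java.util.regex.Pattern;

final class DisplayNameCleaner {
    // Compiled once and reused; Pattern instances are immutable and safe to share between threads.
    private static final Pattern UNWANTED =
            Pattern.compile("^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]");

    // Hypothetical helper: removes every match of the combined expression.
    static String clean(String s) {
        return UNWANTED.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String test = "   ! ... my-Cruc i@l_\\/Disp lay.Na#m3 ?;()!    ";
        System.out.println(clean(test)); // prints my-Crucil_Display.Nam3
    }
}
```

Whether this matters at 50 calls per second is something to measure rather than assume.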

Or, to make it Unicode-aware, add the (?U) flag:

s = s.replaceAll("(?U)^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]", "");
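The effect of the flag shows up with a non-ASCII letter. The sketch below compares both variants on a made-up input; the class and method names are illustrative only.

```java
final class UnicodeFlagDemo {
    // Without (?U), \p{Alpha} and \p{Alnum} cover US-ASCII only.
    static String asciiOnly(String s) {
        return s.replaceAll("^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]", "");
    }

    // With (?U), the same classes follow the Unicode definitions of letters and digits.
    static String unicodeAware(String s) {
        return s.replaceAll("(?U)^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]", "");
    }

    public static void main(String[] args) {
        String input = "  Zo\u00eb_1 !"; // contains U+00EB, e with diaeresis
        System.out.println(asciiOnly(input));    // the non-ASCII letter is stripped: Zo_1
        System.out.println(unicodeAware(input)); // the non-ASCII letter is kept
    }
}
```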

Details

  • ^\P{Alpha}+ - one or more characters other than alphabetic characters at the start of the string
  • | - or
  • \P{Alnum}+$ - one or more characters other than alphanumeric characters at the end of the string
  • | - or
  • [^\p{Alnum}_.-] - any character other than an alphanumeric character, _, . or -, anywhere in the string
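To see how the alternatives divide the work, the matches can be listed one by one. A small sketch with a hypothetical RemovedPieces class: on the test string from the question it finds eight matches, namely the leading run, six single characters in the middle, and the trailing run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RemovedPieces {
    private static final Pattern UNWANTED =
            Pattern.compile("^\\P{Alpha}+|\\P{Alnum}+$|[^\\p{Alnum}_.-]");

    // Hypothetical helper: collects every substring the expression would remove, in order.
    static List<String> find(String s) {
        List<String> pieces = new ArrayList<>();
        Matcher m = UNWANTED.matcher(s);
        while (m.find()) {
            pieces.add(m.group());
        }
        return pieces;
    }

    public static void main(String[] args) {
        String test = "   ! ... my-Cruc i@l_\\/Disp lay.Na#m3 ?;()!    ";
        for (String piece : find(test)) {
            System.out.println("[" + piece + "]"); // brackets make whitespace visible
        }
    }
}
```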

See the regex demo.
