Removing all punctuation except - and _ from a Java string using RegEx

I am trying to remove all punctuation except - and _ using a method I found here, but I can only get it to work for " using the exact code as posted, which uses a negative lookahead:

(?!")\\p{Punct}

//Java example:

String string = ".\"'";
System.out.println(string.replaceAll("(?!\")\\p{Punct}", ""));

I tried:

name = name.replaceAll("(?!_-)\\p{Punct}", ""); // which just replaces all punctuation.

name = name.replaceAll("(?!\_-)\\p{Punct}", ""); // which gives an error.

Thanks.

Use character class subtraction (written in Java as an intersection with a negated class), and add a + quantifier so that each run of one or more punctuation characters is matched as a single chunk:

name = name.replaceAll("[\\p{Punct}&&[^_-]]+", "");

See the Java demo.

The pattern [\\p{Punct}&&[^_-]]+ matches one or more characters from the \\p{Punct} class, except _ and -.
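As a rough illustration of the intersection pattern above, here is a minimal, self-contained sketch. The class name `PunctuationFilter`, the method name `strip`, and the sample input are assumptions made up for this example; only the regex itself comes from the answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the original answer.
class PunctuationFilter {
    // \p{Punct} intersected with "anything except _ and -",
    // with + so that a run of punctuation is removed in one match.
    private static final Pattern PUNCT_EXCEPT_DASH_UNDERSCORE =
            Pattern.compile("[\\p{Punct}&&[^_-]]+");

    // Returns the input with all ASCII punctuation removed except '_' and '-'.
    static String strip(String input) {
        return PUNCT_EXCEPT_DASH_UNDERSCORE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Sample input chosen for this sketch.
        System.out.println(strip("Hello, World! foo_bar-baz (test)."));
        // prints: Hello World foo_bar-baz test
    }
}
```

Precompiling the pattern is a design choice for repeated calls; a one-off `name.replaceAll(...)` as in the answer behaves the same way.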

The construction you found can also be used, but you need to put - and _ into a character class: (?!_-) only rejects the two-character sequence _-, whereas (?![_-]) rejects either character. Use .replaceAll("(?![_-])\\p{Punct}", "") or, to match whole runs at once, .replaceAll("(?:(?![_-])\\p{Punct})+", "").
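To make the difference between the lookahead variants concrete, the following sketch compares them on one input. The class name `LookaheadFilter`, its method names, and the sample string are hypothetical and exist only for this example.

```java
// Hypothetical comparison class for illustration; not part of the original answer.
class LookaheadFilter {
    // Lookahead with a character class: skips '_' and '-' individually.
    static String stripPerChar(String s) {
        return s.replaceAll("(?![_-])\\p{Punct}", "");
    }

    // Same check, grouped and repeated so a run of punctuation is one match.
    static String stripRuns(String s) {
        return s.replaceAll("(?:(?![_-])\\p{Punct})+", "");
    }

    // The attempt from the question: the lookahead only rejects the
    // literal sequence "_-", so lone '_' and '-' are still removed.
    static String stripQuestionAttempt(String s) {
        return s.replaceAll("(?!_-)\\p{Punct}", "");
    }

    public static void main(String[] args) {
        String sample = "a_b-c.d!?"; // sample input chosen for this sketch
        System.out.println(stripPerChar(sample));         // prints: a_b-cd
        System.out.println(stripRuns(sample));            // prints: a_b-cd
        System.out.println(stripQuestionAttempt(sample)); // prints: abcd
    }
}
```

Both working variants produce the same result; the grouped form simply performs fewer replacements when punctuation characters are adjacent.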
