
Parsing special characters using Java Regex

I have a requirement to remove special characters from a string, except for those that appear in an array of allowed characters. The current code removes every special character it finds:

String Modified_remark = final_remark.replaceAll("[^\\x00-\\x7F]", "");

This code removes all special characters from the string, but I want to retain certain ones, such as the Angstrom symbol (Å) and the micron symbol (μ).

For example, if I place the allowed special characters in an array, I want the code to skip the replacement for those characters and replace every other special character with "" (an empty string).

String[] allowedChar = {"Å", "μ"};

More characters will be added when users request them. Can anyone help with this logic?

Just add all the allowedChar entries to the exception list in your regex, that is, inside the negated character class:

final_remark.replaceAll("[^\\x00-\\x7F" + String.join("", allowedChar) + "]", "");

Demo: https://ideone.com/iQWvHI
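
For reference, the one-liner above can be wrapped into a small self-contained program. This is only a sketch: the class name RemarkCleaner, the method clean, and the sample remark are hypothetical and introduced for illustration. It assumes every entry in the list is a single non-ASCII character with no special meaning in a regex, and it writes the characters as Unicode escapes to avoid source-encoding problems.

```java
import java.util.List;

class RemarkCleaner {

    // Removes every non-ASCII character from remark except those in allowed.
    // Assumption: each entry is a single non-ASCII character that has no
    // special meaning inside a regex character class.
    static String clean(String remark, List<String> allowed) {
        String pattern = "[^\\x00-\\x7F" + String.join("", allowed) + "]";
        return remark.replaceAll(pattern, "");
    }

    public static void main(String[] args) {
        // \u00C5 is Å (Latin capital A with ring above), \u03BC is μ (Greek small mu)
        List<String> allowed = List.of("\u00C5", "\u03BC");
        // \u20AC is the euro sign, which is not in the allowed list
        String remark = "5 \u00C5, 10 \u03BCm, 2 \u20AC";
        System.out.println(clean(remark, allowed));
    }
}
```

One thing to watch for: Å (U+00C5) and the dedicated Angstrom sign (U+212B) are different code points, as are Greek mu (U+03BC) and the micro sign (U+00B5). The list needs to contain whichever variants actually occur in the data.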

Update

As Wiktor Stribiżew rightly pointed out, this simple code breaks if allowedChar contains regex special characters; a "]", for example, would close the character class early. Since the requirements imply that allowedChar contains only non-ASCII characters, we can add a check on allowedChar as follows:

String[] allowedChar = {"Å", "μ", "]"};
StringBuilder allowedChars = new StringBuilder();
for (String ch : allowedChar) {
    // Keep only single non-ASCII characters; ASCII entries such as "]" are skipped
    if (ch.matches("[^\\x00-\\x7F]")) {
        allowedChars.append(ch);
    }
}
String Modified_remark = final_remark.replaceAll("[^\\x00-\\x7F" + allowedChars + "]", "");

Demo: https://ideone.com/94513e
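
An alternative to filtering the list is to escape each allowed character, so that no entry can change the structure of the character class. The sketch below is a variation on the answer rather than the answer's own code: it writes every code point as a \x{...} hexadecimal escape, which java.util.regex supports since Java 7. The names SafeRemarkCleaner, buildPattern, and clean are hypothetical.

```java
import java.util.List;

class SafeRemarkCleaner {

    // Builds a negated character class that matches every character which is
    // neither ASCII nor one of the allowed code points. Each code point is
    // emitted as \x{HEX}, so entries like "]" or "^" cannot break the class.
    static String buildPattern(List<String> allowed) {
        StringBuilder sb = new StringBuilder("[^\\x00-\\x7F");
        for (String entry : allowed) {
            entry.codePoints()
                 .forEach(cp -> sb.append(String.format("\\x{%X}", cp)));
        }
        return sb.append(']').toString();
    }

    static String clean(String remark, List<String> allowed) {
        return remark.replaceAll(buildPattern(allowed), "");
    }

    public static void main(String[] args) {
        // "]" is included on purpose to show that it is harmless here
        List<String> allowed = List.of("\u00C5", "\u03BC", "]");
        System.out.println(buildPattern(allowed));
        System.out.println(clean("a]b \u00C5 \u03BC \u20AC [x]", allowed));
    }
}
```

With this variant the list does not need to be validated first, at the cost of a less readable pattern string.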


 