
Replace all characters in a string except a-z, 0-9 and ,

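Hey, I want to sanitize a string so that it only contains a-z, A-Z (including letters from other languages, not only English), 0-9 and the comma. I tried ReplaceAll([^a-z 0-9,]), but it removes the letters of other languages. Can someone tell me how to strip only the special characters, without removing emoji either?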

You could compare each character's ASCII code against the ranges for a-z and 0-9, and if the current character is not in one of them, do what you wish with it. To get the ASCII value of a character in Java, cast the char to an int.

EDIT: The idea is that the characters a-z sit next to each other in ASCII, and so do 0-9. So just write a simple function that returns a boolean telling you whether the current character falls in one of these ranges, and replace it if not. With this approach, though, you have to go through the string one character at a time.
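A minimal sketch of that character-by-character idea, assuming "replace" means "drop the character". The class and method names (AsciiFilter, isAllowed, keepAllowed) are made up for illustration. Note that this only keeps ASCII a-z and 0-9, so it also drops uppercase letters, letters from other languages and emoji, which is not what the question ultimately wants.

```java
final class AsciiFilter {
    // True only for ASCII lowercase letters and ASCII digits.
    // Both ranges are contiguous, so two range checks are enough.
    static boolean isAllowed(char c) {
        return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
    }

    // Walks the string one char at a time and keeps only allowed chars.
    static String keepAllowed(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isAllowed(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("abc-123_XYZ!")); // prints abc123
    }
}
```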

I've tested this regular expression and AFAIK it works...

String result = yourString.replaceAll("[^a-zA-Z0-9]", "");

It replaces any character that isn't in the set a-z, A-Z, or 0-9 with nothing.
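A quick check of this ASCII-only pattern, as a sketch with a hypothetical class name (AsciiRegexDemo). It shows the limitation raised in the question: accented letters are outside a-z and A-Z, so they are removed along with spaces and punctuation. The non-ASCII letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
final class AsciiRegexDemo {
    // Removes everything except ASCII letters and ASCII digits.
    static String strip(String input) {
        return input.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute, \u00f6 is o-umlaut
        String input = "h\u00e9llo w\u00f6rld 42!";
        System.out.println(strip(input)); // prints hllowrld42
    }
}
```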

In Java you can do:

yourString.replaceAll("[^\\p{L}\\p{Nd}]+", "");

The regular expression [^\\p{L}\\p{Nd}]+ matches every run of characters that are neither a Unicode letter nor a decimal digit.

If you need only letters (not digits), you can use the regular expression [^\\p{L}]+ as follows:

yourString.replaceAll("[^\\p{L}]+", "");
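To also keep the comma, the space and emoji that the question mentions, the character class can be widened. The following is only a sketch under assumptions: the class name UnicodeSanitizer is hypothetical, and \p{So} (the Unicode "Other Symbol" category) is used as a rough stand-in for emoji. That category covers most pictographic emoji but also keeps symbols such as the copyright sign, and it does not cover skin-tone modifiers, variation selectors or zero-width joiners, so compound emoji may lose parts. Check the Pattern documentation of your JDK for more precise emoji support.

```java
final class UnicodeSanitizer {
    // Keeps Unicode letters, decimal digits, comma, space and
    // "Other Symbol" characters (an approximation of emoji).
    // Everything else is removed.
    static String sanitize(String input) {
        return input.replaceAll("[^\\p{L}\\p{Nd}, \\p{So}]+", "");
    }

    public static void main(String[] args) {
        // \u00e9 = e-acute, \u00f6 = o-umlaut,
        // \uD83D\uDE00 = U+1F600 (grinning face) as a surrogate pair
        String input = "H\u00e9llo, w\u00f6rld! 123 \uD83D\uDE00 #";
        // The '!' and '#' are dropped; letters, digits, comma,
        // spaces and the emoji remain.
        System.out.println(sanitize(input));
    }
}
```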
