I am parsing a CSV file and I encounter special characters like á.
String line = scanner.nextLine();
Can anyone help me remove á and other corrupted characters from the string line? I tried the following:
line.replaceAll("[^a-zA-Z0-9]+","");
but it is also removing the : , / [ ] symbols.
inputStream = filePart.getInputStream();
Scanner scanner = new Scanner(inputStream);
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    System.out.println("Line : " + line.trim());
    // Split on commas that are outside double-quoted fields
    String[] fields = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
    for (int i = fields.length - 1; i >= 0; i--) {
        System.out.println(i + " " + fields[i].replaceAll("[á]", ""));
    }
}
Why not just replace a positive character class containing the accented character(s):
String input = "hablá";
input = input.replaceAll("[á]", "");
System.out.println(input);
Or, using a Unicode escape:
input = input.replaceAll("[\\u00e1]", "");
Output:
habl
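Add the characters you don't want stripped out to the negated character class in your regex, e.g.
[^a-zA-Z0-9$\/\]\[\:\,]+
This matches any run of characters other than a-z, A-Z, 0-9, $, /, ], [, : and the comma, so those characters are kept when the matches are replaced with an empty string. Don't forget to escape special characters in the pattern with a \, which is written as \\ inside a Java string literal.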
You can also use https://regex101.com/ to check the validity of any regex you create.
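As a rough sketch of how the negated character class above might look in Java, something like the following could work. The class name KeepAllowedChars and the method stripDisallowed are made up for illustration and are not part of the original code; the pattern is the one suggested above, with each backslash doubled for the Java string literal.

```java
import java.util.regex.Pattern;

public class KeepAllowedChars {

    // Hypothetical helper: matches any run of characters that is NOT a letter,
    // a digit, or one of $ / ] [ : ,
    private static final Pattern DISALLOWED =
            Pattern.compile("[^a-zA-Z0-9$\\/\\]\\[\\:\\,]+");

    // Returns a copy of the input with every disallowed character removed.
    // Strings are immutable, so the result must be used or assigned.
    static String stripDisallowed(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // \u00e1 is the accented character from the question
        String line = "habl\u00e1: [a,b]/c";
        System.out.println(stripDisallowed(line)); // prints habl:[a,b]/c
    }
}
```

Note that this pattern also strips whitespace and quotes, because they are not listed in the class. If the fields need them, add them to the class in the same way.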