I use Apache POI to read an Excel file that lists the paths of docx, doc, xls, and xlsx files. I then decrypt each file's content and build a new path to read the data back.
The problem is that this fails when a path contains a French character, as in the following:
/Valérie/CASES.doxcs
is = new FileInputStream(path);
This line throws the following exception:
(No such file or directory)
at java.io.FileInputStream.open(Native Method)
It works well for other paths. Does that mean Apache POI does not support non-English characters, or is something else wrong? Is there any way to fix this?
As this is an operating system matter (the exception comes from java.io.FileInputStream, not from Apache POI), you could convert the paths:
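static String toFileName(String name) {
    // NFKD splits each accented letter into a base letter plus combining marks;
    // removing everything outside ASCII then drops the marks.
    return java.text.Normalizer.normalize(name, java.text.Normalizer.Form.NFKD)
            .replaceAll("\\P{ASCII}", ""); //.replaceAll("[\"/\\\\]", "_");
}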
The above would convert é to e and so on, by splitting an accented letter into a basic letter plus accents and then removing the accents. There might be better transliterations. Also consider Cyrillic and other non-Latin scripts, whose letters would be removed entirely.
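To make the caveat about other scripts concrete, here is a minimal sketch of the same decompose-and-strip approach applied to three inputs. The class name `AsciiFold`, the method name `toAscii`, and the sample strings other than the path from the question are only illustrative.

```java
import java.text.Normalizer;

class AsciiFold {
    // Same idea as toFileName: decompose with NFKD, then drop all non-ASCII characters.
    static String toAscii(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFKD)
                .replaceAll("\\P{ASCII}", "");
    }

    public static void main(String[] args) {
        // Accented Latin letter: the base letter survives.
        System.out.println(toAscii("/Val\u00e9rie/CASES.doxcs")); // /Valerie/CASES.doxcs
        // U+00F8 (o with stroke) has no decomposition, so the whole letter is lost.
        System.out.println(toAscii("S\u00f8ren"));                // Sren
        // Cyrillic text: nothing is ASCII, so the result is empty.
        System.out.println("[" + toAscii("\u0412\u0430\u043b\u0435\u0440\u0438\u044f") + "]"); // []
    }
}
```

Because the result can collide with another name or come out empty, it is probably worth checking the converted name before using it as a path.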
A nicer solution would be to move to a Linux system with UTF-8. You might still want to normalize accent usage to one unique form, say the shortest char sequence:
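static String toFileName(String name) {
    // NFKC recomposes base letter + accent into a single character where possible.
    return java.text.Normalizer.normalize(name, java.text.Normalizer.Form.NFKC);
}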
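As an illustration of why a unique form matters, the following sketch compares the two Unicode spellings of the same visible name: one with a single composed é, one with e followed by a combining accent. It uses NFC, the canonical counterpart of NFKC, as an assumption on my part; the class and method names are hypothetical.

```java
import java.text.Normalizer;

class AccentForms {
    // Composed spelling: é is the single code point U+00E9.
    static final String COMPOSED = "Val\u00e9rie";
    // Decomposed spelling: 'e' followed by the combining acute accent U+0301.
    static final String DECOMPOSED = "Vale\u0301rie";

    static String toNfc(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFC);
    }

    // Two names are treated as equal if their NFC forms match.
    static boolean sameAfterNfc(String a, String b) {
        return toNfc(a).equals(toNfc(b));
    }

    public static void main(String[] args) {
        System.out.println(COMPOSED.equals(DECOMPOSED));        // false
        System.out.println(sameAfterNfc(COMPOSED, DECOMPOSED)); // true
        System.out.println(COMPOSED.length() + " vs " + DECOMPOSED.length()); // 7 vs 8
    }
}
```

If the path stored in the Excel sheet and the name on disk use different spellings, a plain string comparison will not match even though both look identical on screen.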
I tried everything in the linked question "How can I open files containing accents in Java?". In most situations, setting the Eclipse workspace encoding to UTF-8 (Window -> Preferences -> General -> Workspace) and adding the VM argument -Dfile.encoding=UTF-8 to the project's run configuration should already solve the problem.
But if your JDK is not Sun's and you are on a Linux system, you had better run echo $LANG to make sure the locale is UTF-8, and then compile and run the Java source code from the Linux command line. Problem solved. Link for running Java code on Linux: http://www.sergiy.ca/how-to-compile-and-launch-java-code-from-command-line/
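To check what the JVM actually picked up from the locale, a small diagnostic along these lines may help. It is only a sketch: `EncodingCheck` and `representable` are made-up names, and `sun.jnu.encoding` is an implementation-specific property that OpenJDK-derived runtimes report but that may be null on other JVMs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // True if every character of the path can be encoded in the given charset.
    static boolean representable(String path, Charset charset) {
        return charset.newEncoder().canEncode(path);
    }

    public static void main(String[] args) {
        // Path from the question by default; pass another one as the first argument.
        String path = args.length > 0 ? args[0] : "/Val\u00e9rie/CASES.doxcs";
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Implementation-specific; may print null on non-OpenJDK runtimes.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("representable    = " + representable(path, Charset.defaultCharset()));
        // For comparison: plain ASCII cannot represent the accented letter.
        System.out.println("in US-ASCII      = " + representable(path, StandardCharsets.US_ASCII));
    }
}
```

If the default charset turns out to be something like US-ASCII (for example when LANG is unset or set to C), that would be consistent with the "No such file or directory" error for accented paths.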