Unicode characters turning to question marks

In my production-level application I am working on some issues related to Unicode characters, such as Chinese/Japanese strings.

My application has a starting program and a configuration file that sets up all the parameters used by the running JVM. This configuration file is passed as one of the command-line arguments.

The use case in question: I need to pass a config file (content in name-value pair format) whose name contains Unicode characters (i.e. Chinese). When I run the application from the command prompt, I pass the complete config file path, and the file name, when I copy it, looks like "????.conf". When my code receives the path, it is still in the form "some/path/and/????.conf". Eventually, when I run this path through a file-exists check, i.e. new File(path).isFile(), it fails.

So I created a small program to test this scenario. Its job is to take a file path as a command-line argument, print it, and read the contents of that file. Before running, the path looks similar to the above, i.e. "some/path/and/????.conf"; when the program runs and prints the location, it is still the same, i.e. "some/path/and/????.conf". But when I debug it, I can see the correct Chinese characters, and the program is able to read the file and its contents.
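A minimal sketch of such a test program, for comparison. The class name PathArgCheck and the helper toEscaped are hypothetical, not taken from the original application. Besides printing the argument, it prints every non-ASCII character as a backslash-u escape, so the output shows what the JVM actually received even when the console cannot render Chinese characters. If the escaped line contains literal '?' characters, the name was presumably already lost before the Java code ran; if it shows escapes, only the console display is affected.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical diagnostic class; names are illustrative only.
final class PathArgCheck {

    // Render each non-ASCII char as a backslash-u escape so the result
    // stays readable on a console that cannot display the character.
    static String toEscaped(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PathArgCheck <path>");
            return;
        }
        String path = args[0];
        System.out.println("as printed: " + path);            // subject to console encoding
        System.out.println("escaped   : " + toEscaped(path)); // independent of console encoding

        File f = new File(path);
        System.out.println("isFile    : " + f.isFile());
        if (f.isFile()) {
            // Assumes the file content is UTF-8, as stated for the XML file.
            byte[] bytes = Files.readAllBytes(f.toPath());
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }
}
```

Running the same check at the very start of the main application's entry point would show whether the argument arrives differently there.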

So I am not sure what is missing or different in my main application. A few things I have checked and tried are:
1. Changed the code page of the command prompt to UTF-8 via the command chcp 65001.
2. Set the Java property "-Dfile.encoding=UTF-8".
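As background on where "?" characters can come from: when Java encodes a string into a charset that cannot represent some of its characters, String.getBytes substitutes the charset's replacement byte, which is '?' for US-ASCII. The sketch below only illustrates that mechanism; the class name QuestionMarkDemo is hypothetical, and whether this is what happens in the application above is an assumption, not something the question confirms.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; illustrates lossy encoding, not the asker's code.
final class QuestionMarkDemo {

    // Encode the text to bytes in the given charset and decode it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String name = "\u4e2d\u6587.conf"; // two Chinese characters plus ".conf"

        // US-ASCII cannot represent the Chinese characters, so they are lost.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));
    }
}
```

Once a character has been replaced this way, the original cannot be recovered, which is why a path containing literal '?' characters would fail an isFile() check.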

However, that hasn't helped either. The operating system is Windows 7; the Java version is 1.7.0.45.

Any pointers on where to look, and on why similar code works in my small program but not in the main application, would be appreciated.

====== One correction: the file passed to the Java program on the command line is in XML format, and its encoding is set to UTF-8, i.e. via "".

So the same file is passed to both programs. With the simple file-reading class it works, while with the main application it does not. The difference with the main application is that, in addition to this XML file, other parameters are passed along as well.

Thanks,

Vicky

Check the encoding type of the .conf file. It should have been saved with UTF-8 encoding.
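One way to run that check from Java is to decode the file's bytes strictly as UTF-8 and see whether decoding fails. This is a rough sketch; the class name Utf8Check is hypothetical. Note that it can only confirm the bytes are well-formed UTF-8, not that UTF-8 was the encoding the author intended (pure ASCII content, for instance, always passes).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class; names are illustrative only.
final class Utf8Check {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of replacing
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <path>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("well-formed UTF-8: " + isValidUtf8(bytes));
    }
}
```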
