
JAR file with non UTF-8 charset

In Java, a ZIP archive can be parsed with a specified charset, for instance by using the ZipFile(File, Charset) constructor.

JarFile (in the java.util.jar package) inherits from ZipFile, but does not offer a way to use a charset other than UTF-8. I need to parse JAR files that contain strings not encoded in UTF-8. What would be the cleanest workaround?

(I have thought of using reflection to modify the private field ZipFile.zc right after the JarFile() constructor returns, but this solution is not robust and is Oracle-specific.)

According to the documentation, the Charset parameter is only used "to decode the ZIP entry name and comment". It is therefore irrelevant if you are concerned with the contents of the entries. When you read a file from a ZipFile or JarFile, you get an InputStream, which is agnostic of the charset used.

You therefore have to apply the correct charset when converting the byte-based InputStream to a character-based reader, e.g. by using an InputStreamReader.
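As a minimal sketch of this approach (the class name JarEntryText and the method readEntry are made up for illustration, and the caller is assumed to know which charset the entry was written in):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of the JDK.
class JarEntryText {

    // Reads one entry of a JAR file and decodes its bytes with the given charset.
    static String readEntry(File jarPath, String entryName, Charset contentCharset)
            throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(entryName + " not found in " + jarPath);
            }
            // The InputStream yields raw bytes; the charset is only applied here,
            // when the bytes are turned into characters.
            try (InputStream in = jar.getInputStream(entry);
                 Reader reader = new InputStreamReader(in, contentCharset)) {
                StringBuilder text = new StringBuilder();
                char[] buffer = new char[4096];
                int read;
                while ((read = reader.read(buffer)) != -1) {
                    text.append(buffer, 0, read);
                }
                return text.toString();
            }
        }
    }
}
```

For example, JarEntryText.readEntry(new File("app.jar"), "messages.txt", StandardCharsets.ISO_8859_1) would decode the entry as ISO-8859-1; both names in that call are placeholders.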

Edit: If we are talking about the file names inside the ZIP file, you should be able to create a parallel ZipFile instance on the same file with the charset you need. Use JarFile.getName() to obtain the path of the JAR file.
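A rough sketch of the parallel-ZipFile idea, assuming the JarFile was opened from a regular file on disk so that getName() returns a usable path (JarEntryNames and listNames are hypothetical names chosen for this example):

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the JDK.
class JarEntryNames {

    // Opens a second view of the same archive and decodes entry names
    // with the given charset instead of JarFile's UTF-8.
    static List<String> listNames(JarFile jar, Charset nameCharset) throws IOException {
        // JarFile inherits getName() from ZipFile; it returns the path of the archive.
        File path = new File(jar.getName());
        try (ZipFile zip = new ZipFile(path, nameCharset)) {
            List<String> names = new ArrayList<>();
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            return names;
        }
    }
}
```

The JarFile can still be used for everything JAR-specific (manifest, signatures), while the names come from the ZipFile view. Entries can be matched between the two views by their position in the archive, since the decoded names may differ.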
