How to open a Java-generated zip file that uses UTF-8 encoding
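Our product has an export function that uses ZipOutputStream to zip a directory. However, when you zip a directory containing file names with Chinese or Japanese characters, the export does not work properly: for some reason, the file names inside the zip file come out different. Here is a sample of our zip code: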
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipFileName));
out.setEncoding("UTF-8");
//program to add directory to zip
//program add/create file to zip
out.close();
My import routine, also written in Java, imports the zip file correctly, even when the file/directory names contain Chinese/Japanese characters.
ZipFile zipFile = new ZipFile(zipPath, "UTF-8");
Enumeration e = zipFile.getEntries();
while (e.hasMoreElements()) {
    ZipEntry entry = (ZipEntry) e.nextElement();
    String name = entry.getName();
    ....
Is the unzip software having trouble reading the UTF-8 encoded file names, or is something special needed to create a zip file with UTF-8 names that existing software can open without problems?
I wrote a sample program:
package ZipFile;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.tools.zip.ZipEntry;
import org.apache.tools.zip.ZipOutputStream;

public class ZipFolder {
    public static void main(String[] a) throws Exception {
        String srcFolder = "D:/9.4_work/openscript_repo/中文124.All/中文";
        String destZipFile = "D:/Eclipse_Projects/OpenScriptDebuggingProject/src/ZipFile/demo.zip";
        zipFolder(srcFolder, destZipFile);
    }

    // Zips srcFolder recursively into destZipFile.
    static public void zipFolder(String srcFolder, String destZipFile) throws Exception {
        FileOutputStream fileWriter = new FileOutputStream(destZipFile);
        ZipOutputStream zip = new ZipOutputStream(fileWriter);
        zip.setEncoding("UTF-8");
        // With GBK encoding, the Chinese names are displayed correctly after unzipping:
        // zip.setEncoding("GBK");
        addFolderToZip("", srcFolder, zip);
        zip.flush();
        zip.close();
    }

    // Adds one file under the given path; directories are delegated to addFolderToZip.
    static private void addFileToZip(String path, String srcFile, ZipOutputStream zip) throws Exception {
        File folder = new File(srcFile);
        if (folder.isDirectory()) {
            addFolderToZip(path, srcFile, zip);
        } else {
            byte[] buf = new byte[1024];
            int len;
            FileInputStream in = new FileInputStream(srcFile);
            try {
                zip.putNextEntry(new ZipEntry(path + "/" + folder.getName()));
                while ((len = in.read(buf)) > 0) {
                    zip.write(buf, 0, len);
                }
                zip.closeEntry();
            } finally {
                in.close(); // release the file handle even if writing fails
            }
        }
    }

    // Adds every child of srcFolder, prefixing entry names with the folder's own name.
    static private void addFolderToZip(String path, String srcFolder, ZipOutputStream zip) throws Exception {
        File folder = new File(srcFolder);
        for (String fileName : folder.list()) {
            if (path.equals("")) {
                addFileToZip(folder.getName(), srcFolder + "/" + fileName, zip);
            } else {
                addFileToZip(path + "/" + folder.getName(), srcFolder + "/" + fileName, zip);
            }
        }
    }
}
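The following utility class lets you compress and decompress strings using the GZIP compression algorithm. This can be useful if, for example, you want to store long strings in a database.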
import java.io.ByteArrayOutputStream;
import java.io.ByteArrayInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.GZIPInputStream;

public class GzipStringUtil {
    // Encodes the string as UTF-8 and returns the GZIP-compressed bytes.
    public static byte[] compressString(String uncompressedString) throws IllegalArgumentException, IllegalStateException {
        if (uncompressedString == null) {
            throw new IllegalArgumentException("The uncompressed string specified was null.");
        }
        try {
            byte[] utfEncodedBytes = uncompressedString.getBytes("UTF-8");
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            GZIPOutputStream gzipOutputStream = new GZIPOutputStream(baos);
            gzipOutputStream.write(utfEncodedBytes);
            gzipOutputStream.finish();
            gzipOutputStream.close();
            return baos.toByteArray();
        }
        catch (Exception e) {
            throw new IllegalStateException("GZIP compression failed: " + e, e);
        }
    }

    // Decompresses GZIP bytes and decodes the result as a UTF-8 string.
    public static String uncompressString(byte[] compressedString) throws IllegalArgumentException, IllegalStateException {
        if (compressedString == null) {
            throw new IllegalArgumentException("The compressed string specified was null.");
        }
        try {
            ByteArrayInputStream bais = new ByteArrayInputStream(compressedString);
            GZIPInputStream gzipInputStream = new GZIPInputStream(bais);
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // Read in chunks rather than one byte at a time.
            byte[] buffer = new byte[1024];
            int len;
            while ((len = gzipInputStream.read(buffer)) != -1) {
                baos.write(buffer, 0, len);
            }
            gzipInputStream.close();
            baos.close();
            return new String(baos.toByteArray(), "UTF-8");
        }
        catch (Exception e) {
            throw new IllegalStateException("GZIP uncompression failed: " + e, e);
        }
    }
}
Here is a TestCase that shows example usage of the class above:
import java.util.Arrays;
import junit.framework.TestCase;

public class GzipStringUtilTest extends TestCase {
    public void testGzipStringUtil() {
        String input = "This is a test. This is a test. This is a test. This is a test. This is a test.";
        System.out.println("Input: [" + input + "]");
        byte[] compressed = GzipStringUtil.compressString(input);
        System.out.println("Compressed: " + Arrays.toString(compressed));
        System.out.println("-> Compressed input string of length " + input.length() + " to " + compressed.length + " bytes");
        String uncompressed = GzipStringUtil.uncompressString(compressed);
        System.out.println("Uncompressed: [" + uncompressed + "]");
        assertEquals("The uncompressed string [" + uncompressed + "] unexpectedly does not match the input string [" + input + "]", input, uncompressed);
        System.out.println("The input was compressed and uncompressed successfully, and the input matches uncompressed output.");
    }
}
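The top answer here may answer your question; unfortunately, it seems to imply that the Zip format does not actually let you create a Zip file whose file names display correctly on every computer: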
https://superuser.com/questions/60379/linux-zip-tgz-filenames-encoding-problem
I would expect it to work when you set the encoding to GBK, because that is your system's default encoding, so 7zip uses it for every zip file it opens.
It also suggests that the rar and 7z formats have better support.
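The GBK observation above can be illustrated without any zip file at all. The sketch below assumes the unzip tool simply decodes the stored name bytes with whatever charset it was told (or defaults) to use; the class and method names are hypothetical, and GBK is only exercised if the running JVM ships that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameEncodingMismatchDemo {
    // Encode a name with the writer's charset, then decode the raw bytes with
    // the reader's charset. This mimics an unzip tool that has to guess the
    // encoding because the archive does not tell it.
    public static String decodeAs(String name, Charset writer, Charset reader) {
        return new String(name.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character Chinese word used in the question's paths.
        String name = "\u4e2d\u6587.txt";

        // Same charset on both sides: the name survives.
        System.out.println(decodeAs(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            // Written as GBK, read as GBK: also fine, which matches the GBK experiment.
            System.out.println(decodeAs(name, gbk, gbk));
            // Written as UTF-8, read as GBK: the name comes out garbled.
            System.out.println(decodeAs(name, StandardCharsets.UTF_8, gbk));
        }
    }
}
```

The point of the sketch is that neither encoding is wrong by itself; the names break only when writer and reader disagree.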
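I found a blog entry specifically about UTF-8 in Java's zip support. It describes a newer version of the ZIP specification that the current version of Java probably does not produce, but that Java 7 will. I don't know whether the Apache classes use it as well.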
http://blogs.oracle.com/xuemingshen/entry/non_utf_8_encoding_in
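As a rough illustration of the Java 7 behaviour mentioned above, here is a minimal sketch that uses the Charset-taking constructors of java.util.zip.ZipOutputStream and ZipInputStream (available since Java 7) rather than the Apache classes from the question. The class and method names are hypothetical. The flag check assumes the usual ZIP layout, where the general purpose bit flag sits at byte offset 6 of the first local file header and bit 11 marks UTF-8 names; whether a particular unzip tool honours that bit is not something this sketch can show.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class Utf8ZipSketch {
    // Writes one small entry per name into an in-memory zip, encoding names as UTF-8.
    public static byte[] writeZip(List<String> names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("demo".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the entry names back, decoding them as UTF-8.
    public static List<String> readNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(
                new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Checks bit 11 of the general purpose flag in the first local file header
    // (2 bytes, little-endian, at offset 6).
    public static boolean firstEntryHasUtf8Flag(byte[] zipBytes) {
        int flags = (zipBytes[6] & 0xFF) | ((zipBytes[7] & 0xFF) << 8);
        return (flags & 0x0800) != 0;
    }

    public static void main(String[] args) {
        // Escapes keep the source file itself ASCII-only.
        List<String> names = Arrays.asList("\u4e2d\u6587/\u6587\u4ef6.txt", "plain.txt");
        byte[] zipBytes = writeZip(names);
        System.out.println("names read back: " + readNames(zipBytes));
        System.out.println("UTF-8 flag set: " + firstEntryHasUtf8Flag(zipBytes));
    }
}
```

If the flag is set but a tool still shows garbled names, the tool is most likely ignoring the flag and falling back to the system code page, which would be consistent with the GBK experiment in the question.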