
Gzip compression and decompression between Python and Java

I want to decompress, in Java, a string that was gzip-compressed and base64-encoded in Python.

Specifically, I gzip-compress a string in Python and need to decompress that compressed string in Java.

First I gzip-compress the string 'hello'+'\r\n'+'world' using the gzip module in Python and then encode the compressed data as base64. The output I get is H4sIAM7yqVcC/8tIzcnJ5+Uqzy/KSQEAQmZWMAwAAAA=

Then I take the encoded string from Python and decompress it in Java. I first base64-decode it with DatatypeConverter.parseBase64Binary, which gives a byte array, and then gzip-decompress that byte array with GZIPInputStream. But the decompressed output in Java is shown as helloworld.

The string I compressed in Python contained a '\r\n', but it does not appear in the decompressed output. I think the problem is in the base64 encoding and decoding performed on the string. Please help me solve this problem.

String used:

string = 'hello'+'\r\n'+'world'

Expected output in Java:

hello
world

Output got:

helloworld

This is the gzip compression code in Python:

import base64
import gzip
import StringIO  # Python 2 module

string = 'hello' + '\r\n' + 'world'

out = StringIO.StringIO()

# Gzip-compress the string into the in-memory buffer
with gzip.GzipFile(fileobj=out, mode="w") as f:
    f.write(string)

# Base64-encode the compressed bytes and write them to a file
with open('compressed_string', 'wb') as f:
    f.write(base64.b64encode(out.getvalue()))

This is the gzip decompression code in Java:

String compressedStr = "";
String nextLine;
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("compressed_string")));
try {
    // Read the base64 text back from the file written by the Python script
    while ((nextLine = reader.readLine()) != null) {
        compressedStr += nextLine;
    }
} finally {
    reader.close();
}

// Base64-decode to the raw gzip bytes, then decompress them
byte[] compressed = DatatypeConverter.parseBase64Binary(compressedStr);
String decomp = decompress(compressed);

This is the gzip decompression method in Java:

public static String decompress(final byte[] compressed) throws IOException {
    String outStr = "";
    if ((compressed == null) || (compressed.length == 0)) {
        return "";
    }
    if (isCompressed(compressed)) {  // isCompressed: helper not shown in the post
        GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            outStr += line;
        }
    } else {
        outStr = new String(compressed);
    }
    return outStr;
}

From the documentation of BufferedReader.readLine():

Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

Returns:

A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
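To see that contract in isolation, here is a minimal sketch that feeds the question's string through BufferedReader.readLine() and joins the pieces the same way the question's loop does. The class name ReadLineDemo and the helper readLines are made up for illustration; they do not come from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the question's code
class ReadLineDemo {

    // Collects every line that readLine() returns for the given text
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // the "\r\n" terminator has already been stripped
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = readLines("hello\r\nworld");
        System.out.println(lines);                  // prints [hello, world]
        System.out.println(String.join("", lines)); // prints helloworld
    }
}
```

So the base64 and gzip steps are not what drops the '\r\n'; it disappears when the lines are glued back together without a separator.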

bufferedReader.readLine() reads line by line and drops the line terminator,

so you need to add '\r\n' back when you append each line:

outStr += line + "\r\n";

But you should use a StringBuilder rather than repeated string concatenation.
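As a rough sketch of that advice, assuming Java 8 or later, the version below collects the output in a StringBuilder and reads characters in chunks instead of lines. That way whatever terminators the Python side wrote come through unchanged, whereas re-appending "\r\n" after every readLine() call would also add one after the last line. It uses java.util.Base64 in place of DatatypeConverter, which is my substitution and not something from the question. GzipBase64Demo, decompressFromBase64 and compressToBase64 are hypothetical names; compressToBase64 only stands in for the Python side so the round trip can be checked in one place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the question's code
class GzipBase64Demo {

    // Base64-decodes, then gunzips, keeping every character including "\r\n"
    static String decompressFromBase64(String base64) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(base64.trim());
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            // read() returns raw characters, so line terminators are preserved
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        }
        return out.toString();
    }

    // Stand-in for the Python side: gzip the text, then base64-encode it
    static String compressToBase64(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        // Base64 string copied from the question
        String encoded = "H4sIAM7yqVcC/8tIzcnJ5+Uqzy/KSQEAQmZWMAwAAAA=";
        String text = decompressFromBase64(encoded);
        // Reports whether the CRLF survived decompression
        System.out.println(text.contains("\r\n"));
    }
}
```

If you do want to keep readLine(), the same StringBuilder can be used with out.append(line).append("\r\n"), at the cost of normalizing every terminator to "\r\n".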
