GZIPInputStream to String
I am attempting to convert the gzipped body of an HTTP response to plaintext. I've taken the byte array of this response and converted it to a ByteArrayInputStream. I've then converted this to a GZIPInputStream. I now want to read the GZIPInputStream and store the final decompressed HTTP response body as a plaintext String.
This code will store the final decompressed contents in an OutputStream, but I want to store the contents as a String:
public static int sChunk = 8192;
ByteArrayInputStream bais = new ByteArrayInputStream(responseBytes);
GZIPInputStream gzis = new GZIPInputStream(bais);
byte[] buffer = new byte[sChunk];
int length;
while ((length = gzis.read(buffer, 0, sChunk)) != -1) {
out.write(buffer, 0, length);
}
To decode bytes from an InputStream, you can use an InputStreamReader. Then, a BufferedReader will allow you to read your stream line by line.
Your code will look like:
ByteArrayInputStream bais = new ByteArrayInputStream(responseBytes);
GZIPInputStream gzis = new GZIPInputStream(bais);
// Pass the charset explicitly; UTF-8 is an assumption here, use the one declared by the response.
InputStreamReader reader = new InputStreamReader(gzis, StandardCharsets.UTF_8);
BufferedReader in = new BufferedReader(reader);
String line;
while ((line = in.readLine()) != null) {
    System.out.println(line);
}
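If the goal is a single String rather than printed lines, the same reader chain can collect its output. The following is a minimal sketch, not a definitive implementation: the class name GzipLines and both helper methods are hypothetical, and the gzip helper exists only to build sample input. Note that reading line by line normalizes line endings to "\n" and drops a trailing newline.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipLines {

    // Hypothetical helper: decompresses gzipped bytes, decodes them with the
    // given charset, and joins the decoded lines with '\n'.
    public static String gunzipToString(byte[] gzipped, Charset charset) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)), charset))) {
            return in.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper, used only to produce sample gzipped input.
    public static byte[] gzip(String text, Charset charset) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so bos holds the complete data.
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] responseBytes = gzip("line one\nline two", StandardCharsets.UTF_8);
        System.out.println(gunzipToString(responseBytes, StandardCharsets.UTF_8));
    }
}
```

If the exact line terminators matter, reading into a char buffer is the safer choice than reading line by line.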
You should rather have obtained the response as an InputStream instead of as byte[]. Then you can ungzip it using GZIPInputStream, read it as character data using InputStreamReader, and finally write it as character data into a String using StringWriter.
String body = null;
String charset = "UTF-8"; // You should determine it based on response header.
try (
InputStream gzippedResponse = response.getInputStream();
InputStream ungzippedResponse = new GZIPInputStream(gzippedResponse);
Reader reader = new InputStreamReader(ungzippedResponse, charset);
Writer writer = new StringWriter();
) {
char[] buffer = new char[10240];
for (int length = 0; (length = reader.read(buffer)) > 0;) {
writer.write(buffer, 0, length);
}
body = writer.toString();
}
// ...
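The comment in the snippet above says the charset should come from the response header. One way to do that, sketched here under assumptions, is to parse the charset parameter of the Content-Type header value. The class ContentTypeCharset and the method charsetOf are hypothetical names, the parsing is deliberately simple (it does not handle every form the HTTP grammar allows), and how you obtain the header string depends on your HTTP client.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // header value such as "text/html; charset=ISO-8859-1".
    // Returns the fallback when the header is missing, has no charset
    // parameter, or names a charset this JVM does not support.
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("application/json", StandardCharsets.UTF_8));
    }
}
```

The resulting Charset can be passed directly to the InputStreamReader constructor in place of the hard-coded "UTF-8" string.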
If your final intent is to parse the response as HTML, then I strongly recommend just using an HTML parser for this, like Jsoup. It's then as easy as:
String html = Jsoup.connect("http://google.com").get().html();
Use the try-with-resources idiom (which automatically closes any resources opened in try(...) on exit from the block) to make the code cleaner.
Use Apache Commons IO's IOUtils to convert the InputStream to a String. The single-argument IOUtils.toString(InputStream) decodes with the platform default charset; pass a charset explicitly if the encoding is known.
import org.apache.commons.io.IOUtils;
public static String gzipFileToString(File file) throws IOException {
try(GZIPInputStream gzipIn = new GZIPInputStream(new FileInputStream(file))) {
return IOUtils.toString(gzipIn);
}
}
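For comparison, here is a sketch of the same idea without Commons IO and with an explicit charset. It assumes Java 9 or later, where InputStream.readAllBytes() is available. The class GzipFileText and the roundTrip demo helper are hypothetical names introduced for illustration, and UTF-8 in the demo is an assumption about the file's encoding.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipFileText {

    // Reads a gzipped file fully and decodes it with the given charset.
    // Requires Java 9+ for InputStream.readAllBytes().
    public static String gzipFileToString(Path file, Charset charset) {
        try (InputStream raw = Files.newInputStream(file);
             InputStream gzipIn = new GZIPInputStream(raw)) {
            return new String(gzipIn.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical demo helper: writes text to a temporary .gz file,
    // reads it back with gzipFileToString, then deletes the file.
    public static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("sample", ".gz");
            try {
                try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmp))) {
                    out.write(text.getBytes(StandardCharsets.UTF_8));
                }
                return gzipFileToString(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello, gzip"));
    }
}
```

Because the whole decompressed content is held in memory, this suits small to medium files; very large inputs are better processed as a stream.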
Use Apache Commons IO to convert the GZIPInputStream to a byte array.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import org.apache.commons.io.IOUtils;
public static byte[] decompressContent(byte[] pByteArray) throws IOException {
    GZIPInputStream gzipIn = null;
    try {
        gzipIn = new GZIPInputStream(new ByteArrayInputStream(pByteArray));
        return IOUtils.toByteArray(gzipIn);
    } finally {
        // Close the stream whether or not decompression succeeded
        if (gzipIn != null) {
            gzipIn.close();
        }
    }
}
To convert the uncompressed byte array to a String, do something like this:
String uncompressedContent = new String(decompressContent(gzippedBytes), StandardCharsets.UTF_8); // gzippedBytes is the compressed byte[]; UTF-8 is an assumption
You can use StringWriter to write to a String.
GZip (wiki) is a file format and a software application used for file compression and decompression. gzip is a single-file/stream lossless data compression utility, where the resulting compressed file generally has the suffix .gz.
String (Plain) ➢ Bytes ➤ GZip-Data (Compress) ➦ Bytes ➥ String (Decompress)
String zipData = "Hi Stackoverflow and GitHub";
// String to Bytes
byte[] byteStream = zipData.getBytes("UTF-8"); // encode with the same charset used for decoding below
System.out.println("String Data:"+ new String(byteStream, "UTF-8"));
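// Bytes to Compressed-Bytes then to String. Compressed bytes are binary, so this String is for display only and cannot be turned back into the original bytes.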
byte[] gzipCompress = gzipCompress(byteStream);
String gzipCompressString = new String(gzipCompress, "UTF-8");
System.out.println("GZIP Compressed Data:"+ gzipCompressString);
// Bytes to DeCompressed-Bytes then to String.
byte[] gzipDecompress = gzipDecompress(gzipCompress);
String gzipDecompressString = new String(gzipDecompress, "UTF-8");
System.out.println("GZIP Decompressed Data:"+ gzipDecompressString);
GZip-Bytes (Compress) ➥ File (*.gz) ➥ String (Decompress)
The GZip filename extension is .gz and its Internet media type is application/gzip.
File textFile = new File("C:/Yash/GZIP/archive.gz.txt");
File zipFile = new File("C:/Yash/GZIP/archive.gz");
org.apache.commons.io.FileUtils.writeByteArrayToFile(textFile, byteStream);
org.apache.commons.io.FileUtils.writeByteArrayToFile(zipFile, gzipCompress);
FileInputStream inStream = new FileInputStream(zipFile);
byte[] fileGZIPBytes = IOUtils.toByteArray(inStream);
byte[] gzipFileDecompress = gzipDecompress(fileGZIPBytes);
System.out.println("GZIPFILE Decompressed Data:"+ new String(gzipFileDecompress, "UTF-8"));
The following functions are used for compression and decompression.
public static byte[] gzipCompress(byte[] uncompressedData) {
byte[] result = new byte[]{};
try (
ByteArrayOutputStream bos = new ByteArrayOutputStream(uncompressedData.length);
GZIPOutputStream gzipOS = new GZIPOutputStream(bos)
) {
gzipOS.write(uncompressedData);
gzipOS.finish(); // Write the GZIP trailer so bos holds the complete stream before toByteArray()
result = bos.toByteArray();
} catch (IOException e) {
e.printStackTrace();
}
return result;
}
public static byte[] gzipDecompress(byte[] compressedData) {
byte[] result = new byte[]{};
try (
ByteArrayInputStream bis = new ByteArrayInputStream(compressedData);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
GZIPInputStream gzipIS = new GZIPInputStream(bis)
) {
//String gZipString= IOUtils.toString(gzipIS);
byte[] buffer = new byte[1024];
int len;
while ((len = gzipIS.read(buffer)) != -1) {
bos.write(buffer, 0, len);
}
result = bos.toByteArray();
} catch (IOException e) {
e.printStackTrace();
}
return result;
}
import java.io.*;
import java.util.zip.*;
public class Ex1 {
public static void main(String[] args) throws Exception{
String str ;
H h1 = new H();
h1.setHcfId("PH12345658");
h1.setHcfName("PANA HEALTH ACRE FACILITY");
str = h1.toString();
System.out.println(str);
if (str == null || str.length() == 0) {
return ;
}
ByteArrayOutputStream out = new ByteArrayOutputStream(str.length());
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
out.close();
String s = out.toString() ;
System.out.println( s );
byte[] ba = out.toByteArray();
System.out.println( "---------------BREAK-------------" );
ByteArrayInputStream in = new ByteArrayInputStream(ba);
// try-with-resources closes the reader and the streams it wraps
try (BufferedReader pr = new BufferedReader(new InputStreamReader(new GZIPInputStream(in)))) {
    String line;
    while ((line = pr.readLine()) != null) {
        System.out.println(line);
    }
}
}
}
You can also do:
try (GZIPInputStream gzipIn = new GZIPInputStream(new ByteArrayInputStream(pByteArray)))
{
....
}
AutoCloseable is a good thing: https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html