Fastest way to GZIP and UDP a large amount of Strings in Java
I'm implementing a logging system that needs to encode the log messages with GZIP and send them off by UDP.
What I've got so far is:
Initialization:
DatagramSocket sock = new DatagramSocket();
baos = new ByteArrayOutputStream();
printStream = new PrintStream(new GZIPOutputStream(baos));
This printStream is then passed out of the logger; messages will arrive through it.
Then every time a message arrives:
byte[] d = baos.toByteArray();
DatagramPacket dp = new DatagramPacket(d,d.length,host,port);
sock.send(dp);
What stumps me currently is that I can't find a way to remove the data from the ByteArrayOutputStream (toByteArray() only takes a copy), and I'm afraid that recreating all three stream objects every time will be inefficient.
Is there some way to remove sent data from the stream? Or should I look in another direction entirely?
You must create a new stream for each message; otherwise, every call to toByteArray() will send all previous messages again.
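As a rough sketch of the "new stream per message" approach: the class name GzipUdpSender and its methods are made up for illustration, and host and port are assumed to be the same values used in the question. Each call builds a fresh ByteArrayOutputStream and GZIPOutputStream, so every datagram is a complete GZIP stream on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the original question's code.
public class GzipUdpSender {
    private final DatagramSocket socket;
    private final InetAddress host;
    private final int port;

    public GzipUdpSender(InetAddress host, int port) throws IOException {
        this.socket = new DatagramSocket();
        this.host = host;
        this.port = port;
    }

    /** Compresses one message into a standalone GZIP byte array. */
    public static byte[] compress(String message) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer; the byte array stays readable.
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(message.getBytes(StandardCharsets.UTF_8));
        }
        return baos.toByteArray();
    }

    /** Sends one message as one datagram. */
    public void send(String message) throws IOException {
        byte[] data = compress(message);
        socket.send(new DatagramPacket(data, data.length, host, port));
    }
}
```

Whether the per-message allocation matters in practice is something to measure; these objects are short-lived and usually cheap for the garbage collector.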
A better approach is probably to wrap the OutputStream of a TCP socket with a GZIPOutputStream:
printStream = new PrintStream(new GZIPOutputStream(sock.getOutputStream()));
Also, don't forget to flush the PrintStream after every message, or nothing will be sent.
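One detail worth knowing about flushing: by default, flush() on a GZIPOutputStream only flushes the underlying stream and leaves pending input inside the compressor. The two-argument constructor GZIPOutputStream(out, true) (available since Java 7) turns on sync flushing so that flush() pushes out the compressed data. A minimal sketch, where SyncFlushDemo is a made-up name and a ByteArrayOutputStream stands in for sock.getOutputStream():

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the sink below is a stand-in for a socket stream.
public class SyncFlushDemo {
    /** Wraps a stream so that each flush() emits the pending compressed data. */
    public static PrintStream gzipPrintStream(OutputStream out) throws IOException {
        // Second argument 'true' enables syncFlush on the GZIP stream.
        return new PrintStream(new GZIPOutputStream(out, true), false, "UTF-8");
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream ps = gzipPrintStream(sink);
        int headerOnly = sink.size(); // the GZIP header is written by the constructor
        ps.println("log message");
        ps.flush();
        System.out.println("bytes emitted by flush: " + (sink.size() - headerOnly));
    }
}
```

Sync flushing costs a few extra bytes per flush, so flushing after every single message trades some compression ratio for lower latency.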
If speed is really that important, you should consider using a DatagramChannel instead of the old (slow) stream API. This should get you started:
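// Note: ByteBufferOutputStream is not a JDK class; it stands for any OutputStream that writes into the given ByteBuffer
ByteBuffer buffer = ByteBuffer.allocate( 1000 );
ByteBufferOutputStream bufferOutput = new ByteBufferOutputStream( buffer );
GZIPOutputStream output = new GZIPOutputStream( bufferOutput );
OutputStreamWriter writer = new OutputStreamWriter( output, "UTF-8" );
writer.write( "log message\n" );
writer.close(); // finishes the GZIP stream
buffer.flip(); // switch from writing to reading (assumes the stream left the buffer in write mode)
DatagramChannel channel = DatagramChannel.open(); // do this once
channel.connect( new InetSocketAddress( host, port ) ); // do this once
channel.write( buffer ); // Send compressed data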
Note: You can reuse the buffer by calling clear() on it before each message, but all the streams must be created anew for each message.
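To make the buffer reuse concrete, here is one way it could look. This is only a sketch: ChannelLogSender and the nested ByteBufferOutputStream are hypothetical names written for this example (the JDK ships no such stream class), and the 1000-byte capacity is taken from the snippet above. A message that compresses to more than the buffer size would throw BufferOverflowException.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical sender that reuses one ByteBuffer across messages.
public class ChannelLogSender implements AutoCloseable {
    /** Minimal OutputStream that appends into a fixed-size ByteBuffer. */
    static final class ByteBufferOutputStream extends OutputStream {
        private final ByteBuffer target;
        ByteBufferOutputStream(ByteBuffer target) { this.target = target; }
        @Override public void write(int b) { target.put((byte) b); }
        @Override public void write(byte[] b, int off, int len) { target.put(b, off, len); }
    }

    private final DatagramChannel channel;
    private final ByteBuffer buffer = ByteBuffer.allocate(1000);

    public ChannelLogSender(InetSocketAddress remote) throws IOException {
        channel = DatagramChannel.open(); // opened once
        channel.connect(remote);          // connected once
    }

    /** Compresses the message into the buffer and leaves it flipped for reading. */
    public static void fill(ByteBuffer buffer, String message) throws IOException {
        buffer.clear(); // discard whatever the previous message left behind
        // A fresh GZIP stream per message, so each datagram has its own header.
        try (GZIPOutputStream gzip = new GZIPOutputStream(new ByteBufferOutputStream(buffer))) {
            gzip.write(message.getBytes(StandardCharsets.UTF_8));
        }
        buffer.flip(); // limit = end of compressed data, position = 0
    }

    public void send(String message) throws IOException {
        fill(buffer, message);
        channel.write(buffer);
    }

    @Override public void close() throws IOException { channel.close(); }
}
```

Note that a single instance like this is not thread-safe, since the buffer is shared between calls.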
If speed is important, it is worth checking that using GZIP will actually help (it will add some latency). For short messages the compressed output can even be larger than the input:
public static void main(String... args) throws IOException {
    test("Hello World");
    test("Nov 20, 2012 4:55:11 PM Main main\n" +
            "INFO: Hello World log message");
}

private static void test(String s) throws IOException {
    byte[] bytes = s.getBytes("UTF-8");
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    GZIPOutputStream outputStream = new GZIPOutputStream(baos);
    outputStream.write(bytes);
    outputStream.close();
    byte[] bytes2 = baos.toByteArray();
    System.out.println("'" + s + "' raw.length=" + bytes.length + " gzip.length=" + bytes2.length);
}
prints
'Hello World' raw.length=11 gzip.length=31
'Nov 20, 2012 4:55:11 PM Main main
INFO: Hello World log message' raw.length=63 gzip.length=80
The answers were helpful with other aspects of my problem, but for the actual question: there is a way to clear the data from a ByteArrayOutputStream. It has a reset() method. It doesn't actually clear the buffer, but it resets the count property to 0, causing it to ignore any data already in the buffer.
Note that writing to a GZIPOutputStream after resetting the underlying ByteArrayOutputStream will cause an error, so I have not yet found a way to reuse everything.