
Fastest way to GZIP and UDP a large amount of Strings in Java

I'm implementing a logging system that needs to encode the log messages with GZIP and send them off by UDP.

What I've got so far is:

Initialization:

DatagramSocket sock = new DatagramSocket(); 
baos = new ByteArrayOutputStream();
printStream = new PrintStream(new GZIPOutputStream(baos));

This printStream is then passed out of the logger; messages will arrive through it.

Then every time a message arrives:

byte[] d = baos.toByteArray();
DatagramPacket dp = new DatagramPacket(d,d.length,host,port);
sock.send(dp);

What stumps me currently is that I can't find a way to remove the data from the ByteArrayOutputStream (toByteArray() only takes a copy), and I'm afraid that recreating all three stream objects every time will be inefficient.

Is there some way to remove sent data from the stream? Or should I look in another direction entirely?

You must create a new stream for each message; otherwise, every call to toByteArray() will send all previous messages again.

A better approach is probably to wrap the OutputStream of a TCP socket with a GZIPOutputStream:

printStream = new PrintStream(new GZIPOutputStream(sock.getOutputStream()));

Also, don't forget to flush the PrintStream after every message, or nothing will be sent.
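One caveat about flushing: by default, flush() on a GZIPOutputStream only flushes the underlying stream, so data still held by the compressor is not pushed out. Since Java 7 the constructor takes a syncFlush flag that changes this. The sketch below is only an illustration under stated assumptions: the class name SyncFlushDemo is made up, and a ByteArrayOutputStream stands in for the socket's output stream so that the effect can be observed without a network connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original answer.
final class SyncFlushDemo {
    /** Returns how many bytes have reached the sink after one message is written and flushed. */
    static int bytesAfterFlush(boolean syncFlush) throws IOException {
        // Stand-in for sock.getOutputStream() of a TCP socket.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(new GZIPOutputStream(sink, syncFlush), false, "UTF-8");
        out.println("INFO: Hello World log message");
        out.flush(); // only forces out compressed data when syncFlush is true
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("syncFlush=false: " + bytesAfterFlush(false) + " bytes at the receiver");
        System.out.println("syncFlush=true:  " + bytesAfterFlush(true) + " bytes at the receiver");
    }
}
```

With syncFlush set to false, a short message typically stays inside the compressor until the stream is closed or more data arrives, which is probably not what a logger wants. The cost of sync flushing is a slightly worse compression ratio.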

If speed is really that important, you should consider using a DatagramChannel instead of the old (slow) stream API. This should get you started:

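ByteBuffer buffer = ByteBuffer.allocate( 1000 );
// ByteBufferOutputStream is not a JDK class; it is assumed to be a small helper that writes into the buffer
ByteBufferOutputStream bufferOutput = new ByteBufferOutputStream( buffer );
GZIPOutputStream output = new GZIPOutputStream( bufferOutput );
OutputStreamWriter writer = new OutputStreamWriter( output, "UTF-8" );
writer.write( "log message\n" );
writer.close(); // finishes the GZIP stream and writes the trailer
buffer.flip(); // switch the buffer from writing to reading

DatagramChannel channel = DatagramChannel.open(); // do this once
channel.connect( new InetSocketAddress( host, port ) ); // do this once; host and port as in the question
channel.write( buffer ); // Send compressed data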

Note: You can reuse the buffer by clearing it (buffer.clear()), but all the streams must be created anew for each message.
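The snippet above relies on a ByteBufferOutputStream, which the JDK does not provide; some libraries ship a class of that name, but the answer does not say which one is meant. A minimal sketch of such a helper, assuming all it needs to do is append to a fixed-size ByteBuffer, could look like this:

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

/** Hypothetical helper: an OutputStream that appends to a fixed-size ByteBuffer. */
final class ByteBufferOutputStream extends OutputStream {
    private final ByteBuffer buffer;

    ByteBufferOutputStream(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public void write(int b) {
        // Throws BufferOverflowException if the buffer is already full.
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] src, int off, int len) {
        // Bulk put; also throws BufferOverflowException if len exceeds the remaining space.
        buffer.put(src, off, len);
    }
}
```

Because the buffer has a fixed capacity, a message whose compressed form exceeds it fails with a BufferOverflowException. For UDP that limit is arguably useful, since the payload has to fit into a single datagram anyway.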

It is worth checking whether GZIP actually helps if speed is important (it will add some latency). For short messages the compressed output can be larger than the input:

public static void main(String... args) throws IOException {
    test("Hello World");
    test("Nov 20, 2012 4:55:11 PM Main main\n" +
            "INFO: Hello World log message");
}

private static void test(String s) throws IOException {
    byte[] bytes = s.getBytes("UTF-8");
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    GZIPOutputStream outputStream = new GZIPOutputStream(baos);
    outputStream.write(bytes);
    outputStream.close();
    byte[] bytes2 = baos.toByteArray();
    System.out.println("'" + s + "' raw.length=" + bytes.length + " gzip.length=" + bytes2.length);
}

prints

'Hello World' raw.length=11 gzip.length=31
'Nov 20, 2012 4:55:11 PM Main main
INFO: Hello World log message' raw.length=63 gzip.length=80

The answers were helpful with other aspects of my problem, but for the actual question: there is a way to clear the data from a ByteArrayOutputStream. It has a reset() method. It doesn't actually clear the buffer, but it resets the count field to 0, causing it to ignore any data already in the buffer.

Note that writing to a GZIPOutputStream after resetting the underlying ByteArrayOutputStream will cause an error, so I have not yet found a way to reuse everything.
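One way to combine the points made so far is to keep a single ByteArrayOutputStream, call reset() on it before each message, and create only the GZIPOutputStream anew so that every datagram carries a complete GZIP stream. The following is a rough sketch under those assumptions rather than a tuned implementation; the class name GzipUdpLogger and its method names are invented for illustration, and the instance is not thread-safe.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical logger class; names are illustrative only.
final class GzipUdpLogger {
    private final DatagramSocket sock;
    private final InetAddress host;
    private final int port;
    // Reused for every message; reset() keeps the backing array allocated.
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream(1024);

    GzipUdpLogger(DatagramSocket sock, InetAddress host, int port) {
        this.sock = sock;
        this.host = host;
        this.port = port;
    }

    /** Compresses one message into a complete, independent GZIP stream. */
    byte[] compress(String message) throws IOException {
        baos.reset(); // discard the bytes of the previous message
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) { // fresh header per message
            gzip.write(message.getBytes(StandardCharsets.UTF_8));
        } // close() writes the GZIP trailer
        return baos.toByteArray();
    }

    /** Compresses the message and sends it as a single datagram. */
    void log(String message) throws IOException {
        byte[] d = compress(message);
        sock.send(new DatagramPacket(d, d.length, host, port));
    }
}
```

Only the GZIPOutputStream is recreated per message; the socket and the byte buffer are allocated once. Whether this is fast enough is best settled by measuring with real log traffic.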

