Why does my Java serialization show poor performance for single vs multiple conversions?

Below is my benchmarking code. First I serialize a single object, then I do the same thing one million times. I was expecting a proportional increase in timing. Instead I get:

1323048944117 1323048944131
1323048944117 1323048944210

14ms for one object
93ms for 1mil objects

What is happening here? I'm most worried about the time for a single object conversion, which I'm hoping to get down to the sub-millisecond level.

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Pojo implements Serializable {
    String blah = "11111111111111111111aaaaaaaaaaaaaaaaaaaa";
}

public class SerializeTest {
    public static void main(String[] args) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = null;
        Pojo pojo = new Pojo();
        long start = 0;
        long end1 = 0;
        try {
            out = new ObjectOutputStream(bos);
            start =  System.currentTimeMillis();
            out.writeObject(pojo);
            end1 = System.currentTimeMillis();
            for (int i=0;i<1000000;i++) {
                out.writeObject(pojo);
            }

        } catch (Exception ex) {

        }
        long end2 = System.currentTimeMillis();
        System.out.println(start + " " + end1);
        System.out.println(start + " " + end2);

    }
}

Try running the two in reverse order: first the one million writes, then the single write, and see what happens.

Ultimately @duffmo is right that your results say far more (about 98%) about the problems with micro-benchmarking in Java than about serialization. He posted a good link to read.

Another factor confusing your test is that serialization is smart. Writing the first object writes all of the class and field information as well as the data. The 2nd through nth writes of the same object instance write only a reference to the first object. The first object looks to be ~120 bytes and each remaining one is only 5 bytes. Those later writes are also much faster because they do a lot less work.
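To see the back-reference effect for yourself, here is a minimal sketch that measures how many bytes each write adds to the stream. The class name `BackReferenceDemo` and the helper `measure()` are made up for illustration, and the size of the first write depends on the class name, so it will not match the ~120 bytes above exactly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class BackReferenceDemo {
    // Same shape as the Pojo in the question (nested here to keep the sketch in one file).
    static class Pojo implements Serializable {
        private static final long serialVersionUID = 1L;
        String blah = "11111111111111111111aaaaaaaaaaaaaaaaaaaa";
    }

    /** Returns { bytes after the first write, bytes added by writing the same instance again }. */
    static int[] measure() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            Pojo pojo = new Pojo();

            out.writeObject(pojo);
            out.flush(); // ObjectOutputStream buffers internally, so flush before measuring
            int first = bos.size(); // includes the stream header and the class description

            out.writeObject(pojo); // same instance: only a handle is written
            out.flush();
            int repeat = bos.size() - first;

            return new int[] { first, repeat };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = measure();
        System.out.println("first write (with header): " + sizes[0] + " bytes");
        System.out.println("repeat of same instance:   " + sizes[1] + " bytes");
    }
}
```

The repeat write should come out as 5 bytes: a one-byte reference tag plus a four-byte handle. That is why the million-write loop in the question is mostly timing handle lookups rather than real serialization work.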

My test did the 1st object in 8ms, the next 1000 objects in 5ms, and then a million objects in 2000ms. This mostly shows you how fast serialization is. It also shows that Java does some magic run-time optimizations (JIT compilation of hot code) that cause drastically non-linear speed graphs.

If you are trying to do some serialization speed testing, I would make your Pojo class generate some random numbers (or a random string), and I would compare the serialization of 1000 objects to a million to get a better comparison of per-object serialization times. Either that, or start the timing after you have written 1000 objects or so. For example, writing a different object each time produces a file that is ~10 times larger but only took 5600ms.
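A rough sketch of that kind of test might look like the following. It is not a rigorous benchmark (a harness such as JMH would be the more reliable route); the class name `SerializeBench`, the helper `nanosPerObject`, and the warm-up count of 10,000 are all assumptions chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializeBench {
    static class Pojo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String blah;
        Pojo(String blah) { this.blah = blah; }
    }

    /** Serializes `count` distinct objects to one stream and returns the average nanoseconds per object. */
    static double nanosPerObject(int count) {
        // Build the objects up front so allocation is not part of the measurement.
        Pojo[] pojos = new Pojo[count];
        for (int i = 0; i < count; i++) {
            pojos[i] = new Pojo("" + Math.random()); // distinct value per instance
        }

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            long start = System.nanoTime(); // nanoTime is intended for measuring elapsed time
            for (Pojo p : pojos) {
                out.writeObject(p);
            }
            out.flush();
            long end = System.nanoTime();
            return (end - start) / (double) count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        nanosPerObject(10_000); // warm-up run; result deliberately discarded
        System.out.printf("1,000 objects:     %.0f ns/object%n", nanosPerObject(1_000));
        System.out.printf("1,000,000 objects: %.0f ns/object%n", nanosPerObject(1_000_000));
    }
}
```

Because every object is distinct and the first timed run happens after a warm-up, the two per-object figures should be much closer to each other than the 14ms vs 93ms numbers in the question, though they will still vary from run to run.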

Just FYI, your second time should actually be 79ms: the second println measures from start rather than end1, so the 93ms includes the 14ms spent on the first write.

The code should read:

System.out.println(end1+ " " + end2);

Create a new 'pojo' object to send each time. You might also want a different value for 'blah' in each instance of pojo, e.g. "" + Math.random().

I think you'll see the performance for the one million be more like what you expect. IIRC, when you serialize the same object again and again to the same stream, the Java libraries cache it in a dictionary and send only a 'reference' back to the originally sent object. That's why you're seeing such a crazy fast number for the million-pojo stream.
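One quick way to check this is to compare the total stream size when the same instance is written repeatedly against the size when a fresh instance is created for each write. This is only a sketch; `SharedVsFresh` and `streamSize` are hypothetical names, not anything from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SharedVsFresh {
    static class Pojo implements Serializable {
        private static final long serialVersionUID = 1L;
        String blah = "" + Math.random(); // different value in each instance
    }

    /** Writes `count` objects to one stream and returns the total number of bytes produced. */
    static int streamSize(int count, boolean freshEachTime) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            Pojo shared = new Pojo();
            for (int i = 0; i < count; i++) {
                // The stream tracks already-written objects by identity,
                // so only a new instance forces the field data to be written again.
                out.writeObject(freshEachTime ? new Pojo() : shared);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size(); // closing the stream above flushed everything into bos
    }

    public static void main(String[] args) {
        System.out.println("same instance x1000:  " + streamSize(1000, false) + " bytes");
        System.out.println("fresh instance x1000: " + streamSize(1000, true) + " bytes");
    }
}
```

The fresh-instance stream should be several times larger, which is the extra work the original benchmark never did.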
