
The fastest way to add elements to a Set collection

I need to add more than 10,000,000 sequential elements to a TreeSet.

If I use

for (long index = 0; index < 10000000; index++) {
    result.add(index);
}

This takes 8083 ms. Is there a way to improve the performance of this task?

https://github.com/cyberterror/TestRanges

PS: The fastest way so far is List<Integer> range = IntStream.range(0, 10000000).boxed().collect(Collectors.toList()); which takes ~370 ms.

You are already adding the items in sorted order. A TreeSet still has to find the insertion point and rebalance its tree on every add, which is expensive, whereas a LinkedHashSet simply keeps the insertion order.

So if you actually need a Set, use the LinkedHashSet implementation, like this:

Set<Long> result = new LinkedHashSet<Long>();
// index is already a Long, so add() takes it as-is; note that index++ still boxes the next value
for (Long index = 0L; index != 10000000L;) {
    result.add(index++);
}

Read here: https://dzone.com/articles/hashset-vs-treeset-vs
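
The swap is only equivalent because the elements arrive already sorted. As a minimal sketch (the class and method names are made up for illustration), the following checks that both sets iterate in the same order for ascending input. Keep in mind that LinkedHashSet does not offer the NavigableSet operations of TreeSet, such as floor or ceiling.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class OrderCheck {

    /** Returns true if both sets iterate in the same order after ascending inserts. */
    static boolean sameIterationOrder(long n) {
        Set<Long> linked = new LinkedHashSet<>();
        Set<Long> tree = new TreeSet<>();
        for (long i = 0; i < n; i++) {
            linked.add(i); // keeps insertion order
            tree.add(i);   // keeps sorted order
        }
        // List.equals compares element by element, in iteration order.
        return new ArrayList<>(linked).equals(new ArrayList<>(tree));
    }

    public static void main(String[] args) {
        System.out.println(sameIterationOrder(100_000L));
    }
}
```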

Do you really need a collection? If so, for which purpose? Using a plain array may improve performance drastically.

long[] ar = new long[10000000];
for (int i = 0; i < 10000000; i++) {
    ar[i] = (long) i;
}

...

BUILD SUCCESS
------------------------------------------------------------------------
Total time: 0.553 s

Update: most operations can be performed on an array using the java.util.Arrays utility class:

long[] ar = new long[10000000];
for (int i = 0; i < 10000000; i++) {
    ar[i] = (long) i;
}

long[] copyOfRange = Arrays.copyOfRange(ar, 50000, 1000000);

...

BUILD SUCCESS
------------------------------------------------------------------------
Total time: 0.521 s
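
As a rough illustration of the array approach, a sorted long[] can also answer Set-style membership queries through Arrays.binarySearch. This is only a sketch: the class and helper names are hypothetical, and the lookup is valid only because the array is filled in ascending order.

```java
import java.util.Arrays;

class SortedArrayOps {

    /** Builds the sequence 0..n-1 as a primitive array (no boxing). */
    static long[] range(int n) {
        long[] ar = new long[n];
        for (int i = 0; i < n; i++) {
            ar[i] = i; // int widens to long implicitly
        }
        return ar;
    }

    /** Membership test; correct only for an array sorted in ascending order. */
    static boolean contains(long[] sorted, long value) {
        return Arrays.binarySearch(sorted, value) >= 0;
    }

    public static void main(String[] args) {
        long[] ar = range(1_000);
        System.out.println(contains(ar, 42L));
        System.out.println(contains(ar, 5_000L));
        long[] slice = Arrays.copyOfRange(ar, 100, 200); // elements at indices 100..199
        System.out.println(slice.length);
    }
}
```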

Try HPPC: High Performance Primitive Collections for Java

License: Apache License 2.0

<dependency>
  <groupId>com.carrotsearch</groupId>
  <artifactId>hppc</artifactId>
  <version>0.7.1</version>
</dependency>

LongHashSet executes in 1190ms:

LongSet result = new LongHashSet();
for (Long index = 0L; index < 10000000L;) {
  result.add(index++);
}

LongScatterSet executes in 850ms:

LongSet result = new LongScatterSet();
for (Long index = 0L; index < 10000000L;) {
  result.add(index++);
}

TreeSet is backed by a balanced red-black tree. It takes so much time because the tree is rebalanced every time you add a new item. Try adding the items in a different order, namely:

  • 5 000 000 - middle of 0 and 10 000 000 (your set size is 10 000 000)
  • 2 500 000 - middle of 0 and 5 000 000
  • 7 500 000 - middle of 5 000 000 and 10 000 000
  • number in the middle of 0 and 2 500 000
  • number in the middle of 2 500 000 and 5 000 000
  • number in the middle of 5 000 000 and 7 500 000
  • number in the middle of 7 500 000 and 10 000 000
  • etc.

This way the tree stays balanced and no additional balancing operations should be needed. Just make sure that the algorithm for computing the next number to add is not too complex.
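
The ordering above can be sketched with a queue of ranges that are processed level by level. This is an illustration only: the class and method names are hypothetical, and the queue bookkeeping costs time and memory of its own, so whether it actually beats sequential insertion has to be measured.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.TreeSet;

class MidpointInsertion {

    /** Adds 0..n-1 to a TreeSet, inserting range midpoints level by level. */
    static TreeSet<Long> fill(long n) {
        TreeSet<Long> result = new TreeSet<>();
        // Each entry is a half-open range {lo, hi} whose elements are not inserted yet.
        Queue<long[]> pending = new ArrayDeque<>();
        pending.add(new long[] {0L, n});
        while (!pending.isEmpty()) {
            long[] range = pending.remove();
            long lo = range[0];
            long hi = range[1];
            if (lo >= hi) {
                continue; // empty range, nothing to insert
            }
            long mid = lo + (hi - lo) / 2;
            result.add(mid);
            pending.add(new long[] {lo, mid});      // left half, excluding mid
            pending.add(new long[] {mid + 1, hi});  // right half, excluding mid
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<Long> set = fill(1_000L);
        System.out.println(set.size() + " elements, first=" + set.first() + ", last=" + set.last());
    }
}
```

With n = 10,000,000 the first three values inserted are 5,000,000, 2,500,000 and 7,500,000, matching the list above.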
