
Java - Why is this implementation of a binary heap faster than the other?

After reading a bit about heaps/priority queues, I recently made my own implementation of one. Afterwards I decided to compare the performance of my implementation to that of one which I found in a book, and the results are a bit confusing to me. It appears that there is a vast performance difference between the insert methods of the two implementations.

I used this code to test both heaps:

Random rnd = new Random();
long startTime = System.currentTimeMillis();
for(int i = 0; i < 10_000_000; i++) heap.insert(rnd.nextInt(1000));
System.out.println(System.currentTimeMillis() - startTime);

When I run this with my heap implementation, I get a result of around 600ms. When I run it with the book's implementation I get around 1900ms. How can the difference possibly be this big? Surely there must be something wrong with my implementation.

My implementation:

public class Heap<T extends Comparable<? super T>> {

    private T[] array = (T[])new Comparable[10];
    private int size = 0;

    public void insert(T data) {
        if(size+1 > array.length) expandArray();

        array[size++] = data;
        int pos = size-1;
        T temp;

        while(pos != 0 && array[pos].compareTo(array[pos/2]) < 0) {
            temp = array[pos/2];
            array[pos/2] = array[pos];
            array[pos] = temp;
            pos /= 2;
        }
    }

    private void expandArray() {
        T[] newArray = (T[])new Comparable[array.length*2];

        for(int i = 0; i < array.length; i++)
            newArray[i] = array[i];

        array = newArray;
    }
}

The book's implementation:

public class BooksHeap<AnyType extends Comparable<? super AnyType>>
{
    private static final int DEFAULT_CAPACITY = 10;

    private int currentSize;
    private AnyType [ ] array;

    public BooksHeap( )
    {
        this( DEFAULT_CAPACITY );
    }

    public BooksHeap( int capacity )
    {
        currentSize = 0;
        array = (AnyType[]) new Comparable[ capacity + 1 ];
    }

    public void insert( AnyType x )
    {
        if( currentSize == array.length - 1 )
            enlargeArray( array.length * 2 + 1 );

        int hole = ++currentSize;
        for( array[ 0 ] = x; x.compareTo( array[ hole / 2 ] ) < 0; hole /= 2 )
            array[ hole ] = array[ hole / 2 ];
        array[ hole ] = x;
    }


    private void enlargeArray( int newSize )
    {
        AnyType [] old = array;
        array = (AnyType []) new Comparable[ newSize ];
        for( int i = 0; i < old.length; i++ )
            array[ i ] = old[ i ];
    }
}

Edit: The book is "Data Structures and Algorithm Analysis in Java" by Mark Allen Weiss. Third edition. ISBN: 0-273-75211-1.

Here is your code, measured with JMH:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@OperationsPerInvocation(Measure.SIZE)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@State(Scope.Thread)
@Fork(1)
public class Measure
{
  static final int SIZE = 4_000_000;
  private Random rnd;

  @Setup public void setup() {
    rnd  = new Random();
  }

  @Benchmark public Object heap() {
    Heap<Integer> heap = new Heap<>();
    for (int i = 0; i < SIZE; i++) heap.insert(rnd.nextInt());
    return heap;
  }

  @Benchmark public Object booksHeap() {
    BooksHeap<Integer> heap = new BooksHeap<>();
    for (int i = 0; i < SIZE; i++) heap.insert(rnd.nextInt());
    return heap;
  }

  public static class Heap<T extends Comparable<? super T>> {

    private T[] array = (T[])new Comparable[10];
    private int size = 0;

    public void insert(T data) {
      if(size+1 > array.length) expandArray();

      array[size++] = data;
      int pos = size-1;
      T temp;

      while(pos != 0 && array[pos].compareTo(array[pos/2]) < 0) {
        temp = array[pos/2];
        array[pos/2] = array[pos];
        array[pos] = temp;
        pos /= 2;
      }
    }

    private void expandArray() {
      T[] newArray = (T[])new Comparable[array.length*2];
      for (int i = 0; i < array.length; i++)
        newArray[i] = array[i];
      array = newArray;
    }
  }

  public static class BooksHeap<AnyType extends Comparable<? super AnyType>>
  {
    private static final int DEFAULT_CAPACITY = 10;

    private int currentSize;
    private AnyType [ ] array;

    public BooksHeap()
    {
      this( DEFAULT_CAPACITY );
    }

    public BooksHeap( int capacity )
    {
      currentSize = 0;
      array = (AnyType[]) new Comparable[ capacity + 1 ];
    }

    public void insert( AnyType x )
    {
      if( currentSize == array.length - 1 )
        enlargeArray( array.length * 2 + 1 );

      int hole = ++currentSize;
      for( array[ 0 ] = x; x.compareTo( array[ hole / 2 ] ) < 0; hole /= 2 )
        array[ hole ] = array[ hole / 2 ];
      array[ hole ] = x;
    }


    private void enlargeArray( int newSize )
    {
      AnyType [] old = array;
      array = (AnyType []) new Comparable[ newSize ];
      for( int i = 0; i < old.length; i++ )
        array[ i ] = old[ i ];
    }
  }
}

And the results:

Benchmark          Mode  Cnt   Score    Error  Units
Measure.booksHeap  avgt    5  62.712 ± 23.633  ns/op
Measure.heap       avgt    5  62.784 ± 44.228  ns/op

Within the measurement error, the two are the same.

Moral of the exercise: don't think you can just write a loop and call it a benchmark. Measuring anything meaningful within a complex, self-optimizing runtime like HotSpot is an incredibly difficult challenge, best left to an expert benchmark tool like JMH.

As a side note, you could shave some 20% off your times (in both implementations) by using System.arraycopy instead of the manual loop. Embarrassingly, this wasn't my idea: IntelliJ IDEA's automatic inspection suggested it and converted the code on its own :)
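For illustration, here is a minimal sketch of that replacement. GrowDemo and its method names are hypothetical and not part of either heap; the sketch only shows that a single System.arraycopy call produces the same array as the manual loop. It makes no claim about the speed-up, which should be measured with JMH.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of either heap.
final class GrowDemo {

    // Manual element-by-element copy, as in expandArray() / enlargeArray().
    static Integer[] growWithLoop(Integer[] old, int newLength) {
        Integer[] bigger = new Integer[newLength];
        for (int i = 0; i < old.length; i++)
            bigger[i] = old[i];
        return bigger;
    }

    // Same result with one bulk copy of old.length elements.
    static Integer[] growWithArraycopy(Integer[] old, int newLength) {
        Integer[] bigger = new Integer[newLength];
        System.arraycopy(old, 0, bigger, 0, old.length);
        return bigger;
    }

    public static void main(String[] args) {
        Integer[] data = {3, 1, 4, 1, 5};
        Integer[] viaLoop = growWithLoop(data, 10);
        Integer[] viaCopy = growWithArraycopy(data, 10);
        // Both arrays hold the five values followed by five nulls.
        System.out.println(Arrays.equals(viaLoop, viaCopy));
    }
}
```

In the heaps above, the body of expandArray() or enlargeArray() would shrink to allocating the new array, one System.arraycopy call, and the assignment to the field.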

As for the testing part of this question: how you test these implementations can explain much of the difference. Consider this example. When I place your heap in a class called OPHeap and the book's heap in a class called BookHeap, and then test in this order:

import java.util.Random;

public class Test {
    public static void main(String ...args) {
        {
            Random rnd = new Random();
            BookHeap<Integer> heap = new BookHeap<Integer>();
            long startTime = System.currentTimeMillis();
            for(int i = 0; i < 10_000_000; i++) heap.insert(rnd.nextInt(1000));
            System.out.println("Book's Heap:" + (System.currentTimeMillis() - startTime));
        }
        {
            Random rnd = new Random();
            OPHeap<Integer> heap = new OPHeap<Integer>();
            long startTime = System.currentTimeMillis();
            for(int i = 0; i < 10_000_000; i++) heap.insert(rnd.nextInt(1000));
            System.out.println("  OP's Heap:" + (System.currentTimeMillis() - startTime));
        }
    }
}

I get this output:

Book's Heap:1924
  OP's Heap:1171

However when I swap the order of the tests I get this output:

  OP's Heap:1867
Book's Heap:1515

In both runs, whichever heap is tested first comes out slower. This is called "warm-up", and you can learn a lot of ways to deal with it from this article. Also, any time you use Random in a test you should define a seed value, so that your "pseudo-random" inputs are reproducible.
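As a rough sketch of both suggestions, the hypothetical harness below generates its inputs from a fixed seed and runs the workload a few times before taking any timings. WarmupDemo, the seed value, and the run counts are assumptions made for illustration, and java.util.PriorityQueue stands in for OPHeap and BookHeap so that the example is self-contained. It reduces the ordering effect shown above but is not a substitute for a proper benchmark tool.

```java
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical harness; PriorityQueue stands in for OPHeap / BookHeap.
final class WarmupDemo {

    static final long SEED = 42L; // arbitrary fixed seed (assumption)

    // The same seed yields the same input sequence on every run.
    static int[] inputs(int count, long seed) {
        Random rnd = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++)
            values[i] = rnd.nextInt(1000);
        return values;
    }

    // Inserts all values into a fresh queue and returns elapsed nanoseconds.
    static long timeInserts(int[] values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        long start = System.nanoTime();
        for (int v : values)
            heap.add(v);
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be discarded as dead code.
        if (heap.size() != values.length) throw new IllegalStateException();
        return elapsed;
    }

    public static void main(String[] args) {
        int[] values = inputs(1_000_000, SEED);
        // Warm-up runs: let the JIT compile the hot code, discard the timings.
        for (int i = 0; i < 5; i++)
            timeInserts(values);
        // Measured runs.
        for (int i = 0; i < 5; i++)
            System.out.println("run " + i + ": " + timeInserts(values) / 1_000_000 + " ms");
    }
}
```

Generating the inputs up front also keeps the cost of Random out of the timed region, so every heap under test sees exactly the same sequence.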
