简体   繁体   中英

time complexity for java arrayList remove(element)

I was trying to graph the Time Complexity of ArrayList's remove(element) method. My understanding is that it should be O(N), however, its giving me a O(1). Can anyone point out what i did wrong here?? Thank you in advance.

   public static void arrayListRemoveTiming(){
    long startTime, midPointTime, stopTime;
    // Spin the computer until one second has gone by, this allows this
    // thread to stabilize;
    startTime = System.nanoTime();
    while (System.nanoTime() - startTime < 1000000000) {
    }
    long timesToLoop = 100000;
    int N;
    ArrayList<Integer> list = new ArrayList<Integer>();

    // Fill up the array with 0 to 10000
    for (N = 0; N < timesToLoop; N++)
        list.add(N);

    startTime = System.nanoTime();
    for (int i = 0; i < list.size() ; i++) {
        list.remove(i);

        midPointTime = System.nanoTime();
        // Run an Loop to capture other cost.
        for (int j = 0; j < timesToLoop; j++) {

        }
        stopTime = System.nanoTime();

        // Compute the time, subtract the cost of running the loop
        // from the cost of running the loop.

        double averageTime = ((midPointTime - startTime) - (stopTime - midPointTime))
                / timesToLoop;

        System.out.println(averageTime);
    }


}

The cost of a remove is O(n) as you have to shuffle the elements to the "right" of that point "left" by one:

                 Delete D
                     |
                     V
+-----+-----+-----+-----+-----+-----+-----+
|  A  |  B  |  C  |  D  |  E  |  F  |  G  |
+-----+-----+-----+-----+-----+-----+-----+
                     <------------------
                      Move E, F, G left

If your test code is giving you O(1) then I suspect you're not measuring it properly :-)

The OpenJDK source, for example, has this:

public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // Let gc do its work

    return oldValue;
}

and the System.arraycopy is the O(n) cost for this function.


In addition, I'm not sure you've thought this through very well:

for (int i = 0; i < list.size() ; i++)
    list.remove(i);

This is going to remove the following elements from the original list:

    0, 2, 4, 8

and so on, because the act of removing element 0 shifts all other elements left - the item that was originally at offset 1 will be at offset 0 when you've deleted the original offset 0, and you then move on to delete offset 1.

First off, you are not measuring complexity in this code. What you are doing is measuring (or attempting to measure) performance. When you graph the numbers (assuming that they are correctly measured) you get a performance curve for a particular use-case over a finite range of values for your scaling variable.

That is not the same as a computational complexity measure; ie big O, or related Bachman-Landau notations . These are about mathematical limits as the scaling variable tends to infinity.

And this is not just a nitpick. It is quite easy to construct examples 1 where performance characteristics change markedly as N gets very large.

What are doing when you graph performance over a range of values and fit a curve is to estimate the complexity.

1 - And a real example is the average complexity of various HashMap functions which switch from O(1) to O(N) (with a very small C ) when N reaches 2^31 . The modality is because the hash array cannot grow beyond 2^31 slots.


The second point is that that the complexity of ArrayList.remove(index) is sensitive to the value of index as well as the list length.

  • The "advertised" complexity of O(N) for the average and worst cases.

  • In the best case, the complexity is actually O(1) . Really!

    This happens when you remove the last element of the list; ie index == list.size() - 1 . That can be performed with zero copying; look at the code that @paxdiablo included in his Answer.


Now to your Question. There are a number of reasons why your code could give incorrect measurements. For example:

  • You are not taking account of JIT compilation overheads and other JVM warmup effects.

  • I can see places where the JIT compiler could potentially optimize away entire loops.

  • The way you are measuring the time is strange. Try treating this as algebra.

     ((midPoint - start) - (stop - midPoint)) / count;

    Now simplify ... and the midPoint term cancels out.

  • You are only removing half of the elements from the list, so you only measuring over the range 50,000 to 100,000 of your scaling variable. (And I expect you are then plotting against the scaling variable; ie you are plotting f(N + 5000) against N.

  • The time intervals you are measuring could be too small for the clock resolution on your machine. (Read the javadocs for nanoTime() to see what resolution it guarantees.)

I recommend that people wanting to avoid mistakes like the above should read:

remove(int) removes the element at the ith INDEX which is O(1)

You probably want remove( Object ) which is O(N) you would need to call remove(Integer.valueOf(i))

it would be more obvious if your list didn't have the elements in order

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM