简体   繁体   中英

Find median of randomly generated numbers

Qn (from cracking coding interview page 91) Numbers are randomly generated and passed to a method. Write a program to find and maintain the median value as new values are generated.

My question is: Why is it that if maxHeap is empty, it's okay to return minHeap.peek() and vice versa in getMedian() method below?

Doesn't this violate the property of finding a median?

I am using the max heap/min heap method to solve the problem. The solution given is as below:

private static Comparator<Integer> maxHeapComparator, minHeapComparator;
    private static PriorityQueue<Integer> maxHeap, minHeap;

    public static void addNewNumber(int randomNumber) {
        if (maxHeap.size() == minHeap.size()) {
            if ((minHeap.peek() != null)
                    && randomNumber > minHeap.peek()) {
                maxHeap.offer(minHeap.poll());
                minHeap.offer(randomNumber);
            } else {
                maxHeap.offer(randomNumber);
            }
        } else {
            if (randomNumber < maxHeap.peek()) {
                minHeap.offer(maxHeap.poll());
                maxHeap.offer(randomNumber);
            } else {
                minHeap.offer(randomNumber);
            }
        }
    }

    public static double getMedian() {
        if (maxHeap.isEmpty()) {
            return minHeap.peek();
        } else if (minHeap.isEmpty()) {
            return maxHeap.peek();
        }
        if (maxHeap.size() == minHeap.size()) {
            return (minHeap.peek() + maxHeap.peek()) / 2;
        } else if (maxHeap.size() > minHeap.size()) {
            return maxHeap.peek();
        } else {
            return minHeap.peek();
        }
    }

The method has a shortcoming that it does not work in situations when both heaps are empty.

To fix, the method signature needs to be changed to return a Double (with the uppercase 'D') Also a check needs to be added to return null when both heaps are empty. Currently, an exception on a failed attempt to convert null to double will be thrown.

Another shortcoming is integer division when the two heaps have identical sizes. You need a cast to make it double - afetr all, that was the whole point behind making a method that finds a median of integers return a double in the first place.

Another disadvantage with this approach is that it doesn't scale well, for example to heap sizes that don't fit in memory.

A very good approximation algorithm is simply storing an approximate median with a fixed increment (eg. 0.10), chosen appropriate to the scale of the problem. For each value, if the value is higher, add 0.10. If the value is lower, subtract 0.10. The result approximates the median, scales well, and can be stored in 4 or 8 bytes.

只是这样做...否则一切正确:

return new Double(minHeap.peek() + maxHeap.peek()) / 2.0;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM