
Find median of randomly generated numbers

Qn (from Cracking the Coding Interview, page 91): Numbers are randomly generated and passed to a method. Write a program to find and maintain the median value as new values are generated.

My question is: why is it okay to return minHeap.peek() if maxHeap is empty, and vice versa, in the getMedian() method below?

Doesn't this violate the definition of a median?

I am using the max heap/min heap method to solve the problem. The solution given is as below:

private static Comparator<Integer> maxHeapComparator, minHeapComparator;
    private static PriorityQueue<Integer> maxHeap, minHeap;

    public static void addNewNumber(int randomNumber) {
        if (maxHeap.size() == minHeap.size()) {
            if ((minHeap.peek() != null)
                    && randomNumber > minHeap.peek()) {
                maxHeap.offer(minHeap.poll());
                minHeap.offer(randomNumber);
            } else {
                maxHeap.offer(randomNumber);
            }
        } else {
            if (randomNumber < maxHeap.peek()) {
                minHeap.offer(maxHeap.poll());
                maxHeap.offer(randomNumber);
            } else {
                minHeap.offer(randomNumber);
            }
        }
    }

    public static double getMedian() {
        if (maxHeap.isEmpty()) {
            return minHeap.peek();
        } else if (minHeap.isEmpty()) {
            return maxHeap.peek();
        }
        if (maxHeap.size() == minHeap.size()) {
            return (minHeap.peek() + maxHeap.peek()) / 2;
        } else if (maxHeap.size() > minHeap.size()) {
            return maxHeap.peek();
        } else {
            return minHeap.peek();
        }
    }

The method has a shortcoming: it does not work when both heaps are empty.

To fix this, change the method signature to return a Double (with an uppercase 'D') and add a check that returns null when both heaps are empty. As written, the method throws an exception in that case, because the null returned by peek() cannot be converted to double.

Another shortcoming is the integer division when the two heaps have identical sizes. You need a cast so that the division happens in double; after all, that was the whole point of having a method that finds the median of integers return a double in the first place.
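The two fixes described above could be sketched roughly as follows. This is a minimal illustration rather than the book's solution: the class name RunningMedian is hypothetical, the heaps are instance fields instead of the static fields in the question, and the insertion logic is copied from the question unchanged.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Hypothetical wrapper class; the name RunningMedian is not from the original post.
class RunningMedian {
    // maxHeap holds the smaller half (largest on top), minHeap the larger half (smallest on top).
    private final PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder());
    private final PriorityQueue<Integer> minHeap = new PriorityQueue<>();

    // Same insertion logic as addNewNumber in the question.
    void addNewNumber(int randomNumber) {
        if (maxHeap.size() == minHeap.size()) {
            if (minHeap.peek() != null && randomNumber > minHeap.peek()) {
                maxHeap.offer(minHeap.poll());
                minHeap.offer(randomNumber);
            } else {
                maxHeap.offer(randomNumber);
            }
        } else {
            if (randomNumber < maxHeap.peek()) {
                minHeap.offer(maxHeap.poll());
                maxHeap.offer(randomNumber);
            } else {
                minHeap.offer(randomNumber);
            }
        }
    }

    // Returns null when no numbers have been added, instead of failing on unboxing.
    Double getMedian() {
        if (maxHeap.isEmpty() && minHeap.isEmpty()) {
            return null;
        }
        if (maxHeap.size() == minHeap.size()) {
            // Widen to double before adding so the division is not integer division.
            double lower = maxHeap.peek();
            double upper = minHeap.peek();
            return (lower + upper) / 2.0;
        }
        if (maxHeap.size() > minHeap.size()) {
            double top = maxHeap.peek();
            return top;
        }
        double top = minHeap.peek();
        return top;
    }
}
```

Callers then have to handle the null result explicitly, which is the trade-off of returning Double instead of double.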

Another disadvantage of this approach is that it doesn't scale well, for example to heap sizes that don't fit in memory.

A very good approximation algorithm simply stores an approximate median and adjusts it by a fixed increment (e.g. 0.10), chosen to suit the scale of the problem. For each incoming value, if the value is higher than the current estimate, add 0.10; if it is lower, subtract 0.10. The result approximates the median, scales well, and can be stored in 4 or 8 bytes.
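A rough sketch of that idea might look like the following. The class name ApproxMedian, the constructor parameters, and the choice to compare each value against the current estimate and leave the estimate unchanged on an exact tie are all assumptions made for illustration; the answer does not spell them out.

```java
// Hypothetical class; names and the starting estimate are assumptions, not from the original answer.
class ApproxMedian {
    private final double step;   // fixed increment, e.g. 0.10
    private double estimate;     // the only state kept: 8 bytes

    ApproxMedian(double initialEstimate, double step) {
        this.estimate = initialEstimate;
        this.step = step;
    }

    // Nudge the estimate one step toward each incoming value.
    void add(double value) {
        if (value > estimate) {
            estimate += step;
        } else if (value < estimate) {
            estimate -= step;
        }
        // Equal values leave the estimate unchanged.
    }

    double get() {
        return estimate;
    }
}
```

The estimate can only move one step per value, so a starting point far from the true median needs many values to catch up, and the result keeps fluctuating by a few steps around the median rather than settling exactly on it.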

Just do this... otherwise everything is correct:

return new Double(minHeap.peek() + maxHeap.peek()) / 2.0;


 