
Java unsorted ArrayList of Doubles, how to get indexes of highest values?

Guys, I have an ArrayList which contains approximately 3000 double values.

I basically need the ordered indexes of the top 100 doubles in the ArrayList. I am not concerned with the actual values of the top 100, just their indexes in order from maximum to minimum.

For example, if the largest values (from max to min) in the ArrayList are at index 50, index 27 and index 96, then I am only concerned with 50, 27, 96, in THAT exact order.

Code for the ArrayList:

ArrayList<Double> ids   = new ArrayList<Double>();

The resulting set or list of indexes may be contained in ANY data structure which maintains the order 50, 27, 96, such as an ArrayList or any other collection type.

In summary:

How do I return the index numbers of the highest 100 values (doubles) in an ArrayList?

Any assistance appreciated, guys.

You can add all the (value, index) pairs to a TreeMap (or another SortedMap), with the value as the key and the index as the map value. SortedMap.values() then returns the indices in the sorted order of the keys. That order is ascending by default, so to get the largest values first you need a reversed comparator or TreeMap.descendingMap().

Edit: This will not work if there are duplicates in your list, as the second put will overwrite the previously stored value (index). So the following seems better:

Create pairs of index and value and add them to a SortedSet (as suggested by StKiller below), using a Comparator that sorts by value and then by index (to be "consistent with equals", as the API doc puts it). Then just take the first 100 pairs, or rather the indices stored in those.

Edit 2: Actually, you don't really need the pairs; you can use a Comparator for the indices that looks up the values ...
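To make "Edit 2" concrete, here is a minimal sketch of a SortedSet of indices whose Comparator looks up the values. The class name IndexSetSketch and the method topIndices are made up for illustration; they are not from the answer above. The comparator sorts by descending value and then by ascending index, so duplicate values do not collide in the set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class IndexSetSketch {
    // Hypothetical helper: returns the indices of the n largest values,
    // ordered from the largest value to the smallest.
    static List<Integer> topIndices(List<Double> values, int n) {
        // Compare indices by the values they point at (descending),
        // then by index (ascending) so equal values stay distinct.
        Comparator<Integer> byValueDescThenIndex = (a, b) -> {
            int c = Double.compare(values.get(b), values.get(a));
            return c != 0 ? c : Integer.compare(a, b);
        };
        TreeSet<Integer> sorted = new TreeSet<>(byValueDescThenIndex);
        for (int i = 0; i < values.size(); i++) {
            sorted.add(i);
        }
        // Take the first n indices in iteration order.
        List<Integer> result = new ArrayList<>();
        for (Integer index : sorted) {
            if (result.size() >= n) {
                break;
            }
            result.add(index);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> ids = Arrays.asList(2.0, 2.5, 1.5, 0.5, 7.5, 7.0, 1.0, 8.0, 4.0, 1.0);
        System.out.println(topIndices(ids, 3)); // prints [7, 4, 5]
    }
}
```

This sorts all 3000 indices, which is O(n log n); for a list of this size that is usually fine.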

Insertion sort runs in O(n^2) time in the worst case. Heap sort runs in O(n log n) time, and you don't even need to sort everything: use a min heap of 100 nodes. When you iterate over your list, compare each value to the root (the smallest value kept so far). If it is larger, replace the root and run the heapify algorithm.

After you finish with all the elements, your heap will contain the top 100 elements. The heap itself is not fully ordered, so remove the elements one at a time (smallest first) and reverse that sequence to get max-to-min order.

Using a proper data structure for the heap nodes lets you keep the indices along with the values.

An example might be

class MinHeapNode
{
    public double value;  // the element's value (the list holds doubles)
    public int index;     // position of the element in the original list
    public MinHeapNode left;
    public MinHeapNode right;
}
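If you would rather not hand-roll the heap nodes, the same idea can be sketched with java.util.PriorityQueue holding indices. This is only an illustration under that assumption: BoundedHeapSketch and topIndices are made-up names, and the order among equal values is not specified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class BoundedHeapSketch {
    // Hypothetical helper: indices of the n largest values, largest first.
    static List<Integer> topIndices(List<Double> values, int n) {
        List<Integer> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap of indices: the root is the index of the smallest value kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>(n,
                (a, b) -> Double.compare(values.get(a), values.get(b)));
        for (int i = 0; i < values.size(); i++) {
            if (heap.size() < n) {
                heap.add(i);
            } else if (values.get(i) > values.get(heap.peek())) {
                // The new value beats the smallest kept value: replace the root.
                heap.poll();
                heap.add(i);
            }
        }
        // Polling yields indices from smallest to largest value; reverse for max-to-min.
        while (!heap.isEmpty()) {
            result.add(heap.poll());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        List<Double> ids = Arrays.asList(2.0, 2.5, 1.5, 0.5, 7.5, 7.0, 1.0, 8.0, 4.0, 1.0);
        System.out.println(topIndices(ids, 3)); // prints [7, 4, 5]
    }
}
```

The heap never holds more than n entries, so the whole pass costs roughly O(size * log n) comparisons.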

I would argue that if you only need the top 100 values, why not use an inverted selection sort that cuts off after 100 iterations? Selection sort guarantees that one value is put into its correct position on each pass, so after 100 passes through the list the first 100 positions hold the top values, in order. Since you want the indexes, sort an array of indexes (comparing the values they point at) rather than the values themselves. I'm sure a more elegant solution exists, but this should be simple to implement.
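A rough sketch of that cut-off selection sort, working on an index array so the original positions are preserved. The names PartialSelectionSketch and topIndices are made up for illustration, and the input is assumed to be a plain double[].

```java
import java.util.Arrays;

class PartialSelectionSketch {
    // Hypothetical helper: indices of the n largest values, largest first.
    static int[] topIndices(double[] data, int n) {
        int k = Math.max(0, Math.min(n, data.length));
        int[] idx = new int[data.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Each pass moves the index of the largest remaining value to position 'pass'.
        for (int pass = 0; pass < k; pass++) {
            int best = pass;
            for (int j = pass + 1; j < idx.length; j++) {
                if (data[idx[j]] > data[idx[best]]) {
                    best = j;
                }
            }
            int tmp = idx[pass];
            idx[pass] = idx[best];
            idx[best] = tmp;
        }
        // Only the first k positions are guaranteed to be in order.
        return Arrays.copyOf(idx, k);
    }

    public static void main(String[] args) {
        double[] ids = {2.0, 2.5, 1.5, 0.5, 7.5, 7.0, 1.0, 8.0, 4.0, 1.0};
        System.out.println(Arrays.toString(topIndices(ids, 3))); // prints [7, 4, 5]
    }
}
```

For 3000 values and 100 passes this is about 300 000 comparisons, which is small enough that the simplicity may be worth it.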

After all this talk of O(thing) for sorting, I thought I should show that the insertion sort is actually the best in this case. The code below shows various suggestions from this page and my own ideas. The relative performances are:

Insert sort: 61 480ns

Object sort: 1 147 538ns

Sorted set: 671 007ns

Limited set: 435 130ns

import java.util.*;

public class DoubleIndexSort {

    static class DI implements Comparable<DI> {
        final int index;

        final double val;


        DI(double v, int i) {
            val = v;
            index = i;
        }


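        // Orders by descending value; ties are broken by ascending index so that
        // equal values are not treated as duplicates by a TreeSet.
        @Override
        public int compareTo(DI other) {
            int c = Double.compare(other.val, val);
            if (c != 0) {
                return c;
            }
            return Integer.compare(index, other.index);
        }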
    }



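    // Verifies that indexes lists the largest values of test in descending order.
    // Works on a copy so the caller's data is unchanged for the next method under test.
    public static void checkResult(double[] test, int[] indexes) {
        double[] remaining = test.clone();
        for(int i = 0;i < indexes.length;i++) {
            int ii = indexes[i];
            double iv = remaining[ii];
            for(int j = 0;j < remaining.length;j++) {
                if (j != ii && remaining[j] > iv) throw new RuntimeException();
            }
            // Mark as consumed; the test values are in [0, 1), so -1 is below all of them.
            remaining[ii] = -1;
        }
    }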


    // Insertion-based selection: keeps the best topN values seen so far in
    // descending order, together with their indexes.
    public static int[] getHighestIndexes(double[] data, int topN) {
        // Never return more indexes than there are elements.
        int n = Math.min(topN, data.length);
        int[] bestIndex = new int[n];
        double[] bestVals = new double[n];
        if (n == 0) {
            return bestIndex;
        }

        // Insertion-sort the first n elements into the buffers.
        for(int i = 0;i < n;i++) {
            int j = i;
            while( (j > 0) && (bestVals[j - 1] < data[i]) ) {
                bestIndex[j] = bestIndex[j - 1];
                bestVals[j] = bestVals[j - 1];
                j--;
            }
            bestVals[j] = data[i];
            bestIndex[j] = i;
        }

        // A later element only matters if it beats the smallest value currently kept.
        for(int i = n;i < data.length;i++) {
            if (bestVals[n - 1] < data[i]) {
                int j = n - 1;
                while( (j > 0) && (bestVals[j - 1] < data[i]) ) {
                    bestIndex[j] = bestIndex[j - 1];
                    bestVals[j] = bestVals[j - 1];
                    j--;
                }
                bestVals[j] = data[i];
                bestIndex[j] = i;
            }
        }

        return bestIndex;
    }


    // Object sort: wrap every value with its index, sort everything, take the first n.
    public static int[] getHighestIndexes2(double[] data, int topN) {
        int n = Math.min(topN, data.length);
        DI[] di = new DI[data.length];
        for(int i = 0;i < data.length;i++) {
            di[i] = new DI(data[i], i);
        }
        Arrays.sort(di);

        int[] res = new int[n];
        for(int i = 0;i < n;i++) {
            res[i] = di[i].index;
        }
        return res;
    }


    // Sorted set: put every (value, index) pair into a TreeSet, then read the first n.
    public static int[] getHighestIndexes3(double[] data, int topN) {
        int n = Math.min(topN, data.length);
        SortedSet<DI> set = new TreeSet<DI>();
        for(int i=0;i<data.length;i++) {
            set.add(new DI(data[i],i));
        }
        Iterator<DI> iter = set.iterator();
        int[] res = new int[n];
        for(int i = 0;i < n;i++) {
            res[i] = iter.next().index;
        }
        return res;
    }


    // Limited set: like the sorted set, but never keeps more than n entries.
    public static int[] getHighestIndexes4(double[] data, int topN) {
        int n = Math.min(topN, data.length);
        SortedSet<DI> set = new TreeSet<DI>();
        for(int i=0;i<data.length;i++) {
            set.add(new DI(data[i],i));
            if( set.size() > n ) {
                // The last element is the smallest value currently in the set.
                set.remove(set.last());
            }
        }
        Iterator<DI> iter = set.iterator();
        int[] res = new int[n];
        for(int i = 0;i < n;i++) {
            res[i] = iter.next().index;
        }
        return res;
    }


    /**
     * @param args
     */
    public static void main(String[] args) {
        long elap1 = 0;
        long elap2 = 0;
        long elap3 = 0;
        long elap4 = 0;
        for(int i = 1;i <= 1000;i++) {
            double[] data = testData();
            long now = System.nanoTime();
            int[] inds = getHighestIndexes(data, 100);
            elap1 += System.nanoTime() - now;
            checkResult(data, inds);
            System.out.println("\nInsert sort: "+(elap1 / i));

            now = System.nanoTime();
            inds = getHighestIndexes2(data, 100);
            elap2 += System.nanoTime() - now;
            checkResult(data, inds);
            System.out.println("Object sort: "+(elap2 / i));

            now = System.nanoTime();
            inds = getHighestIndexes3(data, 100);
            elap3 += System.nanoTime() - now;
            checkResult(data, inds);
            System.out.println("Sorted set:  "+(elap3 / i));

            now = System.nanoTime();
            inds = getHighestIndexes4(data, 100);
            elap4 += System.nanoTime() - now;
            checkResult(data, inds);
            System.out.println("Limited set: "+(elap4 / i));
        }
    }


    private static int[] sequence(int n) {
        int[] indexes = new int[n];
        for(int i = 0;i < n;i++) {
            indexes[i] = i;
        }
        return indexes;
    }


    public static double[] testData() {
        double[] test = new double[3000];
        for(int i = 0;i < test.length;i++) {
            test[i] = Math.random();
        }
        return test;
    }
}

Use insertion sort. This can be done in O(n^2) in the worst case, and since the list of best values is capped at 100 entries, each insertion costs at most 100 shifts. Maintain a List which holds the top 100 values (with their indexes) of the ArrayList you have. Loop through the ArrayList and use insertion sort to place the top elements into the new list.

In a language like Scala you could simply use zipWithIndex, sortWith, take(n) and map:

val ids = List (2.0, 2.5, 1.5, 0.5, 7.5, 7.0, 1.0, 8.0, 4.0, 1.0);
ids.zipWithIndex.sortWith ((x, y) => (x._1 >  y._1)).take (3).map (vi => vi._2)
res65: List[Int] = List(7, 4, 5)

However, in Java you have to write more boilerplate code if calling Scala (which is 100% compatible with Java) is not an option.

That said, a nearly as simple solution might be possible with Functional Java (see its API, List).
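For what it's worth, the standard java.util.stream API (Java 8 and later) allows a pipeline of roughly the same shape as the Scala one-liner. This is just a sketch: StreamTopIndicesSketch and topIndices are made-up names, and n is assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamTopIndicesSketch {
    // Hypothetical helper: indices of the n largest values, largest first.
    static List<Integer> topIndices(List<Double> ids, int n) {
        return IntStream.range(0, ids.size())          // all indices (zipWithIndex)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> ids.get(i))
                        .reversed())                   // by value, descending (sortWith)
                .limit(n)                              // take(n)
                .collect(Collectors.toList());         // indices only (map)
    }

    public static void main(String[] args) {
        List<Double> ids = Arrays.asList(2.0, 2.5, 1.5, 0.5, 7.5, 7.0, 1.0, 8.0, 4.0, 1.0);
        System.out.println(topIndices(ids, 3)); // prints [7, 4, 5]
    }
}
```

Because sorted() is stable on an ordered stream, equal values keep their original index order.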
