
TreeSet: number of elements less than a value efficiently

I need a way to calculate the number of elements less than X in a TreeSet of Integers really fast.

I can use the

  • subSet()
  • headSet()
  • tailSet()

methods but they are really slow (I just need the count, not the numbers themselves). Is there a way?

Thank you.


EDIT:

I found a workaround that makes things a lot faster! I am using BitSet and its cardinality() method. I create a BitSet at first, and for every element added to the TreeSet I set the corresponding index in the BitSet. Now, to count the number of elements up to and including X, I use:

bitset.get(0, X+1).cardinality()

This is much faster compared with treeset.subSet(0, true, X, true).size().

Does anyone know why? I assume BitSet.cardinality() doesn't use linear search.
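To make the workaround above concrete, here is a minimal sketch. The class and method names (BitSetCount, countLessThan) are made up for illustration, and it assumes all elements are non-negative ints small enough to index a BitSet. Note that BitSet.get(from, to) excludes the upper index, so get(0, X+1) counts elements up to and including X, while get(0, X) counts elements strictly less than X. As far as I can tell from the OpenJDK sources, cardinality() is still linear, but it sums Long.bitCount over 64-bit words, so it touches about X/64 longs instead of walking tree entries one by one.

```java
import java.util.BitSet;
import java.util.TreeSet;

class BitSetCount {
    // Counts the set bits in [0, x), i.e. the elements strictly less than x.
    // Assumption: every element stored is a non-negative int.
    static int countLessThan(BitSet bits, int x) {
        return x <= 0 ? 0 : bits.get(0, x).cardinality();
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        BitSet bits = new BitSet();
        for (int v : new int[] {3, 7, 10, 42}) {
            set.add(v);
            bits.set(v); // mirror every insertion into the BitSet
        }
        System.out.println(countLessThan(bits, 10)); // count via the BitSet
        System.out.println(set.headSet(10).size());  // same count via the TreeSet view
    }
}
```

Both lines print the same count; the BitSet has to be kept in sync on removal as well (bits.clear(v)).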

How fast does 'really fast' need to be? Roughly how many elements do you have?

subSet()/headSet()/tailSet() are O(1) because they return a view of the original treeset, but if you size() your subSet() you are still iterating over all the original elements, hence O(N).

Are you using Java 8? This will be about the same but you can parallelise the cost.

Set<Integer> set = new TreeSet<>();
// .. add things to set

long count = set.parallelStream().filter(e -> e < x).count();

NB EDIT

With further exploration and testing I cannot substantiate the claim "if you size() your subSet() you are still iterating over all the original elements". I was wrong. parallelStream().count() on this 4-core machine was ~30% slower than subSet().size().

If you never update the data structure, just precompute the number of elements less than X and keep it in a hashmap!

If you update it infrequently, keep a sorted linked list of numbers. At insert/remove, add/remove from the list in O(1) and update the hashmap (O(n)).

You can have O(log(n)) get and O(log(n)) update by using a balanced (sorted) binary tree. In each element of the tree, also keep the number of its descendants. Now, to get the number of items less than y, you search for y in the binary tree, and whenever you go right instead of left you add the number of elements you skip (the current node plus its left subtree). On update you need to update the counts in the ancestors of the new element too.
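The counting walk described above can be sketched roughly as follows. This is only an illustration: CountingBst and its members are hypothetical names, each node stores its subtree size (itself plus descendants), and the tree is deliberately left unbalanced to keep the code short. That means the O(log(n)) bound only holds if you add rebalancing (for example a red-black or AVL scheme that maintains the size field during rotations); with sorted input this version degrades to O(n).

```java
class CountingBst {
    private static final class Node {
        final int key;
        int size = 1; // nodes in this subtree, including this node
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    private static int size(Node n) { return n == null ? 0 : n.size; }

    private boolean contains(int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    /** Inserts key if absent; returns true if the set changed. */
    boolean add(int key) {
        if (contains(key)) return false; // set semantics keep the sizes exact
        if (root == null) { root = new Node(key); return true; }
        Node n = root;
        while (true) {
            n.size++; // every ancestor of the new node gains one descendant
            if (key < n.key) {
                if (n.left == null) { n.left = new Node(key); return true; }
                n = n.left;
            } else {
                if (n.right == null) { n.right = new Node(key); return true; }
                n = n.right;
            }
        }
    }

    /** Number of stored keys strictly less than y. */
    int countLessThan(int y) {
        int count = 0;
        Node n = root;
        while (n != null) {
            if (y <= n.key) {
                n = n.left;                 // n and its right subtree are >= y
            } else {
                count += size(n.left) + 1;  // n and its whole left subtree are < y
                n = n.right;
            }
        }
        return count;
    }
}
```

Removal is omitted here; it would decrement size along the path in the same way add increments it.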

By the way, if you are willing to accept approximate answers, there could be faster ways too.

Since all answers so far point to data structures different from Java's TreeSet, I would suggest the Fenwick tree, which has O(log(N)) for updates and queries; see the link for a Java implementation.
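For readers without the link, here is a small sketch of how a Fenwick (binary indexed) tree could be used for this count. It is not the linked implementation; FenwickCounter and its methods are names invented for this example, and it assumes the values are integers in a known range 0..maxValue. A boolean array keeps set semantics so that adding the same value twice does not inflate the count.

```java
class FenwickCounter {
    private final int[] tree;        // 1-based Fenwick array; value v lives at index v + 1
    private final boolean[] present; // tracks membership so each value counts once

    // Assumption: all values fall in the range 0..maxValue.
    FenwickCounter(int maxValue) {
        tree = new int[maxValue + 2];
        present = new boolean[maxValue + 1];
    }

    boolean add(int value) {
        if (present[value]) return false;
        present[value] = true;
        update(value, 1);
        return true;
    }

    boolean remove(int value) {
        if (!present[value]) return false;
        present[value] = false;
        update(value, -1);
        return true;
    }

    // Point update: walk upward through the indices responsible for this value.
    private void update(int value, int delta) {
        for (int i = value + 1; i < tree.length; i += i & -i) {
            tree[i] += delta;
        }
    }

    /** Number of stored values strictly less than x, in O(log(maxValue)). */
    int countLessThan(int x) {
        int i = Math.min(Math.max(x, 0), tree.length - 1); // clamp to a valid prefix
        int sum = 0;
        for (; i > 0; i -= i & -i) {
            sum += tree[i]; // prefix sum over values 0..x-1
        }
        return sum;
    }
}
```

The trade-off against the BitSet workaround is memory (one int per possible value instead of one bit) in exchange for logarithmic rather than linear-in-words queries.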

package ArrayListTrial;

import java.util.Scanner;

public class countArray {

    public static void main(String[] args) {
        // Fills an array with 1..100 and counts, by linear scan, how many
        // entries are greater than the number read from standard input.

        int[] array = new int[100];
        Scanner scan = new Scanner(System.in);
        System.out.println("input the number you want to compare:");
        int in = scan.nextInt();
        int count = 0;
        System.out.println("The following is array elements:");
        for(int k=0 ; k<array.length ; k++)
        {
            array[k] = k+1;
            System.out.print(array[k] + " ");
            if(array[k] > in)
            {
                count++;
            }
        }
        System.out.printf("\nThere are %d numbers in the array bigger than %d.\n" , count , in);

    }

}
