TreeSet Comparator

I have a TreeSet and a custom comparator. I get the values from the server according to changes in the stock.

For example: if time=0, the server sends all the entries in the stock (unsorted); if time=200, the server sends the entries added or deleted after time 200 (unsorted).

On the client side I am sorting the entries. My question is which is more efficient:

1> fetch all entries first and then call the addAll method, or
2> add them one by one

There can be millions of entries.

/////////updated///////////////////////////////////

  private static Map<Integer, KeywordInfo> hashMap = new HashMap<Integer, KeywordInfo>();

  // Orders ids by keyword (case-insensitive, null keywords first); ties fall back to the id itself.
  // Declared before sortedSet, because a static initializer cannot reference a field declared later.
  private static final Comparator<Integer> comparator = new Comparator<Integer>() {
    public int compare(Integer o1, Integer o2) {
      int integerCompareValue = o1.compareTo(o2);
      if (integerCompareValue == 0) return integerCompareValue;
      KeywordInfo k1 = hashMap.get(o1);
      KeywordInfo k2 = hashMap.get(o2);
      if (null == k1.getKeyword()) {
        if (null == k2.getKeyword())
          return integerCompareValue;
        else
          return -1;
      } else {
        if (null == k2.getKeyword())
          return 1;
        else {
          int compareString = AlphaNumericCmp.COMPARATOR.compare(k1.getKeyword().toLowerCase(), k2.getKeyword().toLowerCase());
          //int compareString =  k1.getKeyword().compareTo(k2.getKeyword());
          if (compareString == 0)
            return integerCompareValue;
          return compareString;
        }
      }
    }
  };

  private static Set<Integer> sortedSet = new TreeSet<Integer>(comparator);

Now there is an event handler which gives me an ArrayList of updated entries. After adding them to my hashMap I call:

final Map<Integer, KeywordInfo> mapToReturn = new SubMap<Integer, KeywordInfo>(sortedSet, hashMap);

The actual implementation is a tree of linked nodes (TreeSet is backed by a TreeMap, which is a red-black tree), so adding one by one can be faster if you do it right. I don't think this behaviour will change in the near future.

For your problem a stateful comparator may help.

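// snippet, untested; may not work as-is
public class NaturalComparator<T> implements Comparator<T> {
    private boolean anarchy = false;
    private final Comparator<T> parentComparator;

    NaturalComparator(Comparator<T> parent) {
        this.parentComparator = parent;
    }

    public void setAnarchy(boolean anarchy) {
        this.anarchy = anarchy;
    }

    @Override
    public int compare(T a, T b) {
        // In anarchy mode every new element is treated as the largest so far,
        // so elements are kept in insertion order without real comparisons.
        if (anarchy) return 1;
        return parentComparator.compare(a, b);
    }
}
...
NaturalComparator<Integer> natural = new NaturalComparator<Integer>(comparator);
Set<Integer> sortedSet = new TreeSet<Integer>(natural);
natural.setAnarchy(true);    // "sorted" must already be in order and free of duplicates
sortedSet.addAll(sorted);
natural.setAnarchy(false);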

I think your bottleneck is probably more network-related than CPU-related. A bulk operation fetching all the new entries at once would be more network efficient.

With regard to the CPU, the time required to populate a TreeSet does not differ significantly between multiple add() calls and a single addAll(). The reason is that, in the general case, TreeSet's addAll() falls back to AbstractCollection's addAll() ( http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/AbstractCollection.java#AbstractCollection.addAll%28java.util.Collection%29 ), which creates an iterator and calls add() once per element. The one exception is calling addAll() on an empty TreeSet with a SortedSet that uses the same comparator; that case is handled by a linear-time build, but it does not apply to unsorted input such as an ArrayList.
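One way to check this on your own JDK is to count comparator calls. The sketch below is illustrative only: AddAllDemo, CountingComparator and the helper methods are made-up names, and it uses plain Integer ordering rather than the keyword comparator from the question. It compares an add() loop, addAll() from a List, and addAll() from a SortedSet sharing the same comparator into an empty set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

public class AddAllDemo {
    /** Hypothetical helper: natural Integer ordering that counts compare() calls. */
    static final class CountingComparator implements Comparator<Integer> {
        long calls = 0;

        @Override
        public int compare(Integer a, Integer b) {
            calls++;
            return a.compareTo(b);
        }
    }

    /** The values 0..n-1 in a fixed pseudo-random order. */
    static List<Integer> shuffled(int n) {
        List<Integer> data = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) data.add(i);
        Collections.shuffle(data, new Random(1L));
        return data;
    }

    /** Comparator calls needed to fill a TreeSet with one add() per element. */
    static long countWithAdd(List<Integer> data) {
        CountingComparator cmp = new CountingComparator();
        TreeSet<Integer> set = new TreeSet<Integer>(cmp);
        for (Integer value : data) set.add(value);
        return cmp.calls;
    }

    /** Comparator calls needed to fill a TreeSet with a single addAll() from a List. */
    static long countWithAddAll(List<Integer> data) {
        CountingComparator cmp = new CountingComparator();
        TreeSet<Integer> set = new TreeSet<Integer>(cmp);
        set.addAll(data);
        return cmp.calls;
    }

    /** Comparator calls for addAll() from a SortedSet with the same comparator into an empty set. */
    static long countWithSortedSource(List<Integer> data) {
        CountingComparator cmp = new CountingComparator();
        TreeSet<Integer> source = new TreeSet<Integer>(cmp);
        source.addAll(data);
        long before = cmp.calls;            // ignore the cost of building the source
        TreeSet<Integer> target = new TreeSet<Integer>(cmp);
        target.addAll(source);
        return cmp.calls - before;
    }

    public static void main(String[] args) {
        List<Integer> data = shuffled(100000);
        System.out.println("add() loop:             " + countWithAdd(data));
        System.out.println("addAll(List):           " + countWithAddAll(data));
        System.out.println("addAll(same SortedSet): " + countWithSortedSource(data));
    }
}
```

The first two counts should come out identical, because addAll() from a List ends up calling add() for every element. The third should be far lower, since that path copies an already sorted structure instead of re-inserting each element.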

So, my advice on the CPU side is: choose the way that keeps your code cleaner and more readable. That is probably addAll().

In general there is less memory overhead when the data is stored as it is loaded. This should be time efficient too, perhaps using small buffers. Memory allocation costs time as well.

However, time both solutions in a separate prototype. You really have to test with huge numbers, as network traffic is costly too. That is a bit of Test Driven Development, and it gives QA both quantitative statistics and a check on the correctness of the implementation.
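A rough prototype for such a timing could look like the sketch below. It is only a starting point under several assumptions: PopulateTiming and its methods are invented names, the ids are generated locally instead of fetched from a server, and natural Integer ordering stands in for the custom comparator. Wall-clock timing with System.nanoTime() is also noisy (JIT warm-up, garbage collection), so the loop repeats the measurement a few times; a proper benchmark harness would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

public class PopulateTiming {
    /** Hypothetical stand-in for the server data: the ids 0..n-1 in a seeded random order. */
    static List<Integer> randomIds(int n, long seed) {
        List<Integer> ids = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) ids.add(i);
        Collections.shuffle(ids, new Random(seed));
        return ids;
    }

    /** Option 2 from the question: add the entries one by one. */
    static TreeSet<Integer> fillOneByOne(List<Integer> ids) {
        TreeSet<Integer> set = new TreeSet<Integer>();
        for (Integer id : ids) set.add(id);
        return set;
    }

    /** Option 1 from the question: collect everything first, then call addAll(). */
    static TreeSet<Integer> fillWithAddAll(List<Integer> ids) {
        TreeSet<Integer> set = new TreeSet<Integer>();
        set.addAll(ids);
        return set;
    }

    public static void main(String[] args) {
        List<Integer> ids = randomIds(1000000, 42L);
        // Repeat the measurement; the first rounds mostly show JIT warm-up.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            TreeSet<Integer> oneByOne = fillOneByOne(ids);
            long t1 = System.nanoTime();
            TreeSet<Integer> bulk = fillWithAddAll(ids);
            long t2 = System.nanoTime();
            System.out.printf("round %d: add=%d ms, addAll=%d ms, same content=%b%n",
                    round, (t1 - t0) / 1000000, (t2 - t1) / 1000000, oneByOne.equals(bulk));
        }
    }
}
```

Checking that both sets have the same content covers the correctness half of the test; the printed times cover the quantitative half. To measure the real case, swap in the actual comparator and entry type.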
