
Iterate efficiently through 2 different Lists with the same type of object (Java 8)

I have two lists, each containing a large number (N) of objects:

List<Foo> objectsFromDB = {{MailId=100, Status=""}, {MailId=200, Status=""}, {MailId=300, Status=""} ... {MailId=N, Status=N}}

List<Foo> feedBackStatusFromCsvFiles = {{MailId=100, Status="OPENED"}, {MailId=200, Status="CLICKED"}, {MailId=300, Status="HARDBOUNCED"} ... {MailId=N, Status=N}}

A little background: objectsFromDB holds the rows of my database, retrieved by calling a Hibernate method.

feedBackStatusFromCsvFiles calls a CSVparser method and unmarshals the result to Java objects.

My entity class Foo has all the setters and getters, so I know that the basic idea is to use a foreach like this:

    for (Foo fooDB : objectsFromDB) {
        for (Foo fooStatus : feedBackStatusFromCsvFiles) {
            if (fooDB.getMailId().equals(fooStatus.getMailId())) {
                fooDB.setStatus(fooStatus.getStatus());
            }
        }
    }

As far as my modest knowledge as a junior developer goes, I think doing it like this is very bad practice. Should I implement a Comparator and use it to iterate over my lists of objects? Should I also check for null cases?

Thanks to all of you for your answers!

Assuming Java 8, and considering that feedbackStatus may contain more than one element with the same ID:

  1. Transform one list into a Map that uses the ID as key and a list of elements as value.
  2. Iterate over the other list and use the Map to find all matching messages.

The code would be:

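// Needs java.util.Collections, java.util.List, java.util.Map and java.util.stream.Collectors.
// Group the DB objects by mail ID; several objects may share one ID.
final Map<String, List<Foo>> listMap =
        objectsFromDB.stream().collect(
                Collectors.groupingBy(item -> item.getMailId())
        );

// For each CSV row, update every DB object with the same mail ID (none if there is no match).
for (final Foo feedBackStatus : feedBackStatusFromCsvFiles) {
    listMap.getOrDefault(feedBackStatus.getMailId(), Collections.emptyList())
            .forEach(item -> item.setStatus(feedBackStatus.getStatus()));
}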

Use a Map from the collections framework to avoid the nested loops.

    List<Foo> aList = new ArrayList<>();
    List<Foo> bList = new ArrayList<>();
    for(int i = 0;i<5;i++){
        Foo foo = new Foo();
        foo.setId((long) i);
        foo.setValue("FooA"+String.valueOf(i));
        aList.add(foo);
        foo = new Foo();
        foo.setId((long) i);
        foo.setValue("FooB"+String.valueOf(i));
        bList.add(foo);
    }

    final Map<Long,Foo> bMap = bList.stream().collect(Collectors.toMap(Foo::getId, Function.identity()));

    aList.stream().forEach(it->{
        Foo bFoo = bMap.get(it.getId());
        if( bFoo != null){
            it.setValue(bFoo.getValue());
        }
    });

The only other solution would be to have the DTO layer return a map of MailId -> Foo objects, as you could then stream over the CSV list and simply look up the DB Foo object. Otherwise, the expense of sorting or iterating over both of the lists is not worth the trade-off in running time. That holds until the map demonstrably causes a memory constraint on the platform; until then, let the garbage collector do its job, and make yours as easy as possible.
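A minimal sketch of that idea, under assumptions of mine: the Foo class below is a hypothetical stand-in for the entity, mailId is taken to be a Long, and indexByMailId is an invented name for whatever the data layer would expose. It also assumes mailId is unique among the DB rows, because Collectors.toMap throws an IllegalStateException on a duplicate key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MapLookupSketch {

    // Hypothetical stand-in for the entity; only the fields used here.
    static class Foo {
        private final Long mailId;
        private String status;

        Foo(Long mailId, String status) {
            this.mailId = mailId;
            this.status = status;
        }

        Long getMailId() { return mailId; }
        String getStatus() { return status; }
        void setStatus(String status) { this.status = status; }
    }

    // Hypothetical data-layer method: index the loaded rows by mailId.
    // Assumes mailId is unique; toMap throws IllegalStateException otherwise.
    static Map<Long, Foo> indexByMailId(List<Foo> rows) {
        return rows.stream()
                .collect(Collectors.toMap(Foo::getMailId, Function.identity()));
    }

    // Stream over the CSV rows and look each one up in the map.
    static void applyStatuses(Map<Long, Foo> dbByMailId, List<Foo> csvFoos) {
        csvFoos.forEach(csv -> {
            Foo db = dbByMailId.get(csv.getMailId());
            if (db != null) { // CSV rows without a DB counterpart are skipped
                db.setStatus(csv.getStatus());
            }
        });
    }

    public static void main(String[] args) {
        List<Foo> db = new ArrayList<>();
        db.add(new Foo(100L, ""));
        db.add(new Foo(200L, ""));

        List<Foo> csv = new ArrayList<>();
        csv.add(new Foo(200L, "CLICKED"));
        csv.add(new Foo(999L, "OPENED"));

        applyStatuses(indexByMailId(db), csv);
        db.forEach(f -> System.out.println(f.getMailId() + " -> '" + f.getStatus() + "'"));
    }
}
```

If several DB rows can share a mailId, a Map<Long, List<Foo>> built with groupingBy would be needed instead.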

Given that your lists may contain tens of thousands of elements, you should be concerned that your simple nested-loop approach will be too slow. It will certainly perform far more comparisons than it needs to: with 10,000 elements in each list, the nested loops call equals() 100 million times.

If memory is comparatively abundant, then the fastest suitable approach would probably be to form a Map from mailId to (a list of) the corresponding Foos from one of your lists, somewhat as @MichaelH suggested, and to use that to match mailIds. If mailId values are not certain to be unique in one or both lists, however, then you'll need something a bit different from Michael's specific approach. Even if mailIds are sure to be unique within both lists, it will be a bit more efficient to form only one map.

For the most general case, you might do something like this:

// The initial capacity is set (more than) large enough to avoid any rehashing
Map<Long, List<Foo>> dbMap = new HashMap<>(3 * objectsFromDB.size() / 2);

// Populate the map
// This could be done more efficiently if the objects were ordered by mailId,
// which perhaps the DB could be enlisted to ensure.
for (Foo foo : objectsFromDB) {
    Long mailId = foo.getMailId();
    List<Foo> foos = dbMap.get(mailId);

    if (foos == null) {
        foos = new ArrayList<>();
        dbMap.put(mailId, foos);
    }
    foos.add(foo);
}

// Use the map
for (Foo fooStatus: feedBackStatusFromCsvFiles) {
    List<Foo> dbFoos = dbMap.get(fooStatus.getMailId());

    if (dbFoos != null) {
        String status = fooStatus.getStatus();

        // Iterate over only the Foos that we already know have matching Ids
        for (Foo fooDB : dbFoos) {
            fooDB.setStatus(status);
        }
    }
}

On the other hand, if you are space-constrained, so that creating the map is not viable, yet it is acceptable to reorder your two lists, then you should still get a performance improvement by sorting both lists first. Presumably you would use Collections.sort() with an appropriate Comparator for this purpose. Then you would obtain an Iterator over each list, and use them to iterate cooperatively over the two lists. I present no code, but it would be reminiscent of the merge step of a merge sort (though the two lists are not actually merged; you only copy status information from one to the other). This makes sense only if the mailIds from feedBackStatusFromCsvFiles are all distinct, for otherwise the expected result of the whole task is not well determined.
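A rough sketch of that merge-style pass, under assumptions of mine rather than anything given in the question: Foo is a hypothetical stand-in for the entity, mailId is taken to be a Long, both lists are mutable random-access lists such as ArrayList, and the mailIds in the CSV list are distinct. For brevity it walks the lists with two index cursors instead of Iterator objects; with a LinkedList you would want real iterators.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortedMergeSketch {

    // Hypothetical stand-in for the entity; only the fields used here.
    static class Foo {
        private final Long mailId;
        private String status;

        Foo(Long mailId, String status) {
            this.mailId = mailId;
            this.status = status;
        }

        Long getMailId() { return mailId; }
        String getStatus() { return status; }
        void setStatus(String status) { this.status = status; }
    }

    // Sorts both lists in place by mailId, then advances through them in step.
    // Assumes csvFoos has distinct mailIds; dbFoos may repeat a mailId.
    static void copyStatuses(List<Foo> dbFoos, List<Foo> csvFoos) {
        Comparator<Foo> byMailId = Comparator.comparing(Foo::getMailId);
        dbFoos.sort(byMailId);
        csvFoos.sort(byMailId);

        int d = 0; // cursor into dbFoos
        int c = 0; // cursor into csvFoos
        while (d < dbFoos.size() && c < csvFoos.size()) {
            Foo db = dbFoos.get(d);
            Foo csv = csvFoos.get(c);
            int cmp = db.getMailId().compareTo(csv.getMailId());
            if (cmp < 0) {
                d++; // no CSV row for this DB object
            } else if (cmp > 0) {
                c++; // this CSV row has no DB counterpart
            } else {
                db.setStatus(csv.getStatus());
                d++; // keep the CSV cursor: the next DB object may share the id
            }
        }
    }

    public static void main(String[] args) {
        List<Foo> db = new ArrayList<>();
        db.add(new Foo(300L, ""));
        db.add(new Foo(100L, ""));
        db.add(new Foo(200L, ""));

        List<Foo> csv = new ArrayList<>();
        csv.add(new Foo(200L, "CLICKED"));
        csv.add(new Foo(100L, "OPENED"));

        copyStatuses(db, csv);
        db.forEach(f -> System.out.println(f.getMailId() + " -> '" + f.getStatus() + "'"));
    }
}
```

The two sorts cost O(N log N) each and the pass itself is linear, and apart from whatever the sort uses internally, no extra map is allocated.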

Your problem is merging Foo's last status into the database objects, so you can do it in two steps, which makes the code clearer and more readable.

  1. Filter the Foos that need to be merged.
  2. Merge those Foos with the last status.

     // Because the status that counts is always the last one, you needn't use groupingBy to create a complex Map.
     // (toMap is statically imported from java.util.stream.Collectors.)
     Map<String, String> lastStatus = feedBackStatusFromCsvFiles.stream()
             .collect(toMap(Foo::getMailId, Foo::getStatus, (previous, current) -> current));

     // Find the Foos in the database that need to be merged.
     Predicate<Foo> fooThatNeedMerge = it -> lastStatus.containsKey(it.getMailId());

     // Merge Foo's last status from the CSV.
     Consumer<Foo> mergingFoo = it -> it.setStatus(lastStatus.get(it.getMailId()));

     objectsFromDB.stream().filter(fooThatNeedMerge).forEach(mergingFoo);
