What is the most efficient way to find the unique intersection of two different ArrayLists?

Question

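I have two ArrayLists, A and B.

ArrayList A consists of instances of a class that holds a set of data, including an identifier called categoryID. Multiple items in A can share the same categoryID. Taking the items of A in order, the categoryIDs might look like this: [1, 1, 2, 2, 3, 4, 7].

ArrayList B consists of instances of a different class that holds a different set of data, which also includes a categoryID. The categoryID is unique for each item in this list. Example: [1, 2, 3, 4, 5, 6, 7].

Both lists are sorted by categoryID, which hopefully makes this easier.

What I want is a new list C, made up of the items from listB whose categoryID matches at least one item in listA. So for the input above, list C should contain the items with categoryIDs [1, 2, 3, 4, 7].

My strategy so far has been to iterate over both lists. I don't think this is the most efficient way to do it, so I'd like to ask what alternatives I could consider.

My approach: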

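ArrayList<classB> results = new ArrayList<classB>();
for (classA itemA : listA) {
  int categoryID = itemA.categoryID;
  // Scan listB for the item with the same categoryID.
  for (classB itemB : listB) {
    if (itemB.categoryID == categoryID) {
      // Add each matching itemB only once.
      if (!results.contains(itemB)) {
        results.add(itemB);
      }
      // categoryID is unique in listB, so stop after the first match.
      break;
    }
  }
}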

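I first loop over listA and get the categoryID, then loop over listB to find the matching categoryID. Once I find it, I check whether the results list already contains that item from listB. If it doesn't, I add it to results, break out of the inner for loop, and continue through listA. If results already contains itemB, I simply break out of the inner for loop and continue through listA. This approach is O(n^2), which is not great for large data sets. Any ideas for improving it?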

Answer 1

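Add all the categoryIDs from listA to a Set; let's call it setACategories. Then loop over listB, and if setACategories contains the categoryID of an element of listB, add that element of listB to results.

results should also be a Set, since it looks like you want each match from listB to go into results only once rather than multiple times (this lets you avoid the call to !results.contains(itemB)).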
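A minimal sketch of the Set-based approach described above. The class names ItemA and ItemB are hypothetical stand-ins for the question's classA and classB, and the only field assumed is an int categoryID. Because this version visits each element of listB exactly once and the categoryIDs in listB are unique, a plain List is enough for results here; with a HashSet, the whole pass is expected to take O(n + m) time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetIntersectionSketch {
    // Hypothetical stand-in for the question's classA.
    static final class ItemA {
        final int categoryID;
        ItemA(int categoryID) { this.categoryID = categoryID; }
    }

    // Hypothetical stand-in for the question's classB.
    static final class ItemB {
        final int categoryID;
        ItemB(int categoryID) { this.categoryID = categoryID; }
    }

    // Returns the items of listB whose categoryID occurs at least once in listA.
    static List<ItemB> intersect(List<ItemA> listA, List<ItemB> listB) {
        // Collect every categoryID that appears in listA; duplicates collapse.
        Set<Integer> setACategories = new HashSet<>();
        for (ItemA itemA : listA) {
            setACategories.add(itemA.categoryID);
        }

        // Keep each listB item whose categoryID was seen in listA.
        List<ItemB> results = new ArrayList<>();
        for (ItemB itemB : listB) {
            if (setACategories.contains(itemB.categoryID)) {
                results.add(itemB);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<ItemA> listA = new ArrayList<>();
        for (int id : Arrays.asList(1, 1, 2, 2, 3, 4, 7)) {
            listA.add(new ItemA(id));
        }
        List<ItemB> listB = new ArrayList<>();
        for (int id = 1; id <= 7; id++) {
            listB.add(new ItemB(id));
        }

        List<Integer> ids = new ArrayList<>();
        for (ItemB itemB : intersect(listA, listB)) {
            ids.add(itemB.categoryID);
        }
        System.out.println(ids); // prints [1, 2, 3, 4, 7]
    }
}
```

This version does not rely on either list being sorted, so it should also work if the ordering assumption is ever dropped.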

Answer 2

Add the categoryID values from listA to a Set, then loop over listB and pick the elements whose categoryID is in your set.

Answer 3

The best way nowadays is to use Java streams:

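// Needs java.util.ArrayList, java.util.Arrays, java.util.List and java.util.stream.Collectors.
List<foo> list1 = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<foo> list2 = new ArrayList<>(Arrays.asList(new foo(), new foo()));
// Keep the elements of list1 that also occur in list2 (compared with equals()).
// Note that list2.contains is a linear scan, so this is still O(n * m).
List<foo> common = list1.stream().filter(f -> list2.contains(f)).collect(Collectors.toList());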

However, I personally use the Apache Commons library for this kind of work:

https://commons.apache.org/proper/commons-collections/javadocs/api-3.2.1/org/apache/commons/collections/CollectionUtils.html
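Applied to the two lists from the question, a stream-based version might look like the sketch below. ItemA and ItemB are hypothetical stand-ins for classA and classB, assumed to expose an int categoryID field. The categoryIDs of listA are first collected into a Set so that each lookup is a hash lookup rather than a scan of a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StreamIntersectionSketch {
    // Hypothetical stand-in for the question's classA.
    static final class ItemA {
        final int categoryID;
        ItemA(int categoryID) { this.categoryID = categoryID; }
    }

    // Hypothetical stand-in for the question's classB.
    static final class ItemB {
        final int categoryID;
        ItemB(int categoryID) { this.categoryID = categoryID; }
    }

    // Returns the items of listB whose categoryID occurs at least once in listA.
    static List<ItemB> intersect(List<ItemA> listA, List<ItemB> listB) {
        // Distinct categoryIDs present in listA.
        Set<Integer> idsInA = listA.stream()
                .map(itemA -> itemA.categoryID)
                .collect(Collectors.toSet());

        // Filter listB by membership in that set; encounter order is preserved.
        return listB.stream()
                .filter(itemB -> idsInA.contains(itemB.categoryID))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ItemA> listA = new ArrayList<>();
        for (int id : Arrays.asList(1, 1, 2, 2, 3, 4, 7)) {
            listA.add(new ItemA(id));
        }
        List<ItemB> listB = new ArrayList<>();
        for (int id = 1; id <= 7; id++) {
            listB.add(new ItemB(id));
        }

        List<Integer> ids = intersect(listA, listB).stream()
                .map(itemB -> itemB.categoryID)
                .collect(Collectors.toList());
        System.out.println(ids); // prints [1, 2, 3, 4, 7]
    }
}
```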

Answer 4

Have you tried:

// Requires java.util.ArrayList and java.util.Collection.
public void test() {
    Collection<String> c1 = new ArrayList<>();
    Collection<String> c2 = new ArrayList<>();

    c1.add("Text 1");
    c1.add("Text 2");
    c1.add("Text 3");
    c1.add("Text 4");
    c1.add("Text 5");

    c2.add("Text 3");
    c2.add("Text 4");
    c2.add("Text 5");
    c2.add("Text 6");
    c2.add("Text 7");

    // Keep only the elements of c1 that are also in c2 (modifies c1 in place).
    c1.retainAll(c2);

    for (String next : c1) {
        System.out.println(next);  // Output: Text 3, Text 4, Text 5
    }
}

Answer 5

Try Sets.intersection(Set<E> set1, Set<?> set2) from Google Guava.

Of course, you can use Sets.newHashSet(Iterable<? extends E> elements) to convert your lists to sets.

Answer 6

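See the code below. I implemented an intersection that uses the fact that both lists are sorted to improve on the top answer's approach.

It works somewhat like the merge step of merge sort, except that it produces the intersection. I wrote it in about 30 minutes, so it could probably be improved further.

With the current data, it runs 17 times faster than the top answer. It also saves O(n) memory, since it needs only one set.

See also: The intersection of two sorted arrays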

import java.util.*;

public class test {
    public static void main (String[] args) {
        List<Integer> a1 = new ArrayList<Integer>();
        List<Integer> a2 = new ArrayList<Integer>();
        Random r = new Random();

        for(int i = 0; i < 1000000; i++) {
            a1.add(r.nextInt(1000000));
            a2.add(r.nextInt(1000000));
        }

        Collections.sort(a1);
        Collections.sort(a2);

        System.out.println("Starting");

        long t1 = System.currentTimeMillis();
        Set<Integer> set1 = func1(a1, a2);
        long t2 = System.currentTimeMillis();

        System.out.println("Func1 done in: " + (t2-t1) + " milliseconds.");

        long t3 = System.currentTimeMillis();
        Set<Integer> set2 = func2(a1, a2);
        long t4 = System.currentTimeMillis();

        System.out.println("Func2 done in: " + (t4-t3) + " milliseconds.");

        if(set1.size() != set2.size()) {
            System.out.println("ERROR - sizes not equal");
            System.exit(1);
        }

        for(Integer t : set1) {
            if (!set2.contains(t)) {
                System.out.println("ERROR");
                System.exit(1);
            }
        }
    }

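    // Merge-style intersection; assumes a1 and a2 are both sorted in ascending order.
    public static Set<Integer> func1(List<Integer> a1, List<Integer> a2) {
        Set<Integer> intersection = new HashSet<Integer>();

        int index = 0;  // current position in a2; it only ever moves forward
        for (Integer a : a1) {
            // Skip the elements of a2 that are smaller than the current element of a1.
            while (index < a2.size() && a2.get(index) < a) {
                index++;
            }

            // a2 is exhausted, so no further matches are possible.
            if (index == a2.size()) {
                break;
            }

            if (a2.get(index).equals(a)) {
                intersection.add(a);
            }
        }

        return intersection;
    }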

    public static Set<Integer> func2(List<Integer> a1, List<Integer> a2) {
        Set<Integer> intersection = new HashSet<Integer>();
        Set<Integer> tempSet = new HashSet<Integer>();
        for(Integer a : a1) {
            tempSet.add(a);
        }

        for(Integer b : a2) {
            if(tempSet.contains(b)) {
                intersection.add(b);
            }
        }

        return intersection;
    }
}
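
The same merge-style idea can be sketched directly on the two object lists from the question. This is only an illustration: ItemA and ItemB are hypothetical stand-ins for classA and classB with an int categoryID field, both lists are assumed to be sorted by categoryID in ascending order, and listB is assumed to have unique categoryIDs, as the question states. Under those assumptions no Set is needed at all, and the result comes out in listB's order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedIntersectionSketch {
    // Hypothetical stand-in for the question's classA.
    static final class ItemA {
        final int categoryID;
        ItemA(int categoryID) { this.categoryID = categoryID; }
    }

    // Hypothetical stand-in for the question's classB.
    static final class ItemB {
        final int categoryID;
        ItemB(int categoryID) { this.categoryID = categoryID; }
    }

    // Two-pointer walk over both sorted lists; each index only moves forward,
    // so the loop body runs at most listA.size() + listB.size() times.
    static List<ItemB> intersect(List<ItemA> listA, List<ItemB> listB) {
        List<ItemB> results = new ArrayList<>();
        int i = 0; // position in listA
        int j = 0; // position in listB
        while (i < listA.size() && j < listB.size()) {
            int idA = listA.get(i).categoryID;
            int idB = listB.get(j).categoryID;
            if (idA < idB) {
                i++; // this listA item has no partner at or after j
            } else if (idA > idB) {
                j++; // this listB item matches nothing in listA
            } else {
                results.add(listB.get(j));
                j++; // IDs in listB are unique, so move on; repeated IDs
                     // in listA are skipped by the idA < idB branch
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<ItemA> listA = new ArrayList<>();
        for (int id : Arrays.asList(1, 1, 2, 2, 3, 4, 7)) {
            listA.add(new ItemA(id));
        }
        List<ItemB> listB = new ArrayList<>();
        for (int id = 1; id <= 7; id++) {
            listB.add(new ItemB(id));
        }

        List<Integer> ids = new ArrayList<>();
        for (ItemB itemB : intersect(listA, listB)) {
            ids.add(itemB.categoryID);
        }
        System.out.println(ids); // prints [1, 2, 3, 4, 7]
    }
}
```

The index-based get calls are constant time on an ArrayList; with a LinkedList, two iterators would be a better fit.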

Answer 1: score 3, accepted, 2015-01-28 15:46:00
Answer 2: score 1, 2015-01-28 15:47:42
Answer 3: score 1, 2015-01-28 15:48:11
Answer 4: score 1, 2015-01-28 16:01:46
Answer 5: score 0, 2015-01-28 15:48:53
Answer 6: score 0, 2015-01-28 17:16:45
