简体   繁体   中英

Removing duplicates from one list by comparing with another list

I have a two lists of objects and I would like to remove the instances from one list which is there in other list.

eg I have following two lists and suppose each letter represents the object.

List listA = {A, B, C , D, E, F, G, H , I , J}

List listB= {D, G, K, P, Z}

Now, clearly listB has D and G which are there on listA too so I want listA to be like this

listA = {A, B, C , E, F, H , I , J}

Can you guys please suggest what would be the solution for this with O(n) or less than O(n2).

I can iterate over both the lists and remove the duplicate instances by comparing but I want to have something more efficient.

If the lists are unsorted, and are ArrayLists or other similar list implementations with an O(n) contains method, then you should create a HashSet with the items of listB in order to perform the removal. If the items are not put into a set then you will end up with O(n^2) performance.

Easiest way to do what you need is thus:

listA.removeAll(new HashSet(listB));

ArrayList.removeAll(Collection) will not put the items into a set for you (at least in the JDK 1.6 and 1.7 versions I checked), which is why you need to create the HashSet yourself in the above.

The removeAll method will copy the items you wish to keep into the beginning of the list as it traverses it, avoiding array compacting with each removal, so using it against a passed in HashSet as shown is reasonably optimal and is O(n).

You could add both list elements to a Set

To remove elements on one list from another, try listA.removeAll(listB);

Like ssantos answered, you can use a Set.

Alternatively, if the lists are sorted, then you can alternately iterate through them. Iterate through ListA until you reach an element that is greater than the current element of ListB, then iterate through ListB until you reach an element that is greater than the current element of ListA, etc.

Following is some pseudo-C to solve in expected time O(n) .

lenA = length pf listA
lenB = length of listB
shortList = (lenA <= lenB) ? A : B
longList  = (shortList == A) ? B : A

create hash table hashTab with elements of shortList

for each element e in longList:  
    is e present in hashTab:
        remove e from longList

now, longList contains the merged duplicate-free elements

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM