
Binary search not detecting duplicates?

I have an array of items, cards, all with String names, so

Card c1 = new Card("TheCard");

Card c2 = new Card("TheOtherCard");

I then sort the list with a quicksort and use a binary search to check whether a card already exists before adding it.

So,

if (!cards.contains(c3)) {
    cards.add(c3);
}

And my cards.contains method is

Comparator<Card> c = new Comparator<Card>() {    
    @Override
    public int compare(Card u1, Card u2) { 
        return u1.getName().compareTo(u2.getName()); 
    } 
};
int index;
index = Collections.binarySearch(cards, it, c);
if (index == -1) {
    return false;
} else {
    return true;
}

But the problem is that the search reports cards that aren't in the list as present, and cards that are in the list as absent.

I am trying to add 10,000 cards, 8,000 of them unique, but with this contains method I end up with 2,000 "unique" cards, and when I check the list they are not even unique: https://i.imgur.com/N9kQtms.png

I've tried running the code without sorting, and that just returns about 4,000 results with the same problem of repeated cards. When I brute-force it and just use the built-in .contains, it works, but it is very slow.

(Also sorry if I messed up something in my post, it is my first time posting here)

The javadoc states the following:

Searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the specified comparator (as by the sort(List, Comparator) method), prior to making this call. If it is not sorted, the results are undefined. If the list contains multiple elements equal to the specified object, there is no guarantee which one will be found.

It also states that it returns:

the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key. Note that this guarantees that the return value will be >= 0 if and only if the key is found.

Your list must therefore be sorted beforehand, or the result will not make sense. Note also that the method returns either the index of the element or a negative value that encodes its insertion point. Beware of this technicality: check after the call that the result is non-negative and that the element at that index is in fact the one you are looking for, rather than treating an insertion point as a match.

There you could have this test to see if it is your card:

// Test that the search found an element (index >= 0) and that the card at that index has the same name as the card you are looking for.
return index >= 0 && cards.get(index).getName().equals(it.getName());

You could also override equals to have something closer to:

return index >= 0 && cards.get(index).equals(it);

In both cases, checking index >= 0 first ensures that get is never called with the negative value returned when the card is not found, so no IndexOutOfBoundsException is thrown.

The binarySearch gives a non-negative index when it finds an item.

When the item is not found, it gives the bitwise complement of the insert position: ~index == -index-1.

  • Search d in abde gives 2.
  • Search d in abeg gives ~2 == -3, the insert position being 2.
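The two cases above can be reproduced with a small sketch. The class name BinarySearchResultDemo and the helper search are made up for illustration; only Collections.binarySearch comes from the JDK.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BinarySearchResultDemo {
    // Hypothetical helper: returns the raw result of Collections.binarySearch
    // on a list that is already sorted in natural (ascending) order.
    static int search(List<String> sorted, String key) {
        return Collections.binarySearch(sorted, key);
    }

    public static void main(String[] args) {
        List<String> abde = Arrays.asList("a", "b", "d", "e");
        List<String> abeg = Arrays.asList("a", "b", "e", "g");

        // "d" is present at index 2.
        System.out.println(search(abde, "d")); // prints 2

        // "d" is absent; the result is negative and encodes insert position 2.
        int miss = search(abeg, "d");
        System.out.println(miss);  // prints -3
        System.out.println(~miss); // prints 2
    }
}
```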

So the check is:

int index = Collections.binarySearch(cards, it, c);
return index >= 0;
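Since a negative result already encodes the insert position, one possible way to combine the duplicate check with the insertion is sketched below. It assumes a minimal Card with a getName method; the class SortedCardList and the method addIfAbsent are hypothetical names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedCardList {
    // Minimal stand-in for the question's Card class (assumption).
    static class Card {
        private final String name;
        Card(String name) { this.name = name; }
        String getName() { return name; }
    }

    private static final Comparator<Card> BY_NAME = Comparator.comparing(Card::getName);

    // Invariant: always sorted by name, so binarySearch is valid.
    private final List<Card> cards = new ArrayList<>();

    // Adds the card at its sorted position unless a card with the same
    // name is already present. Returns true if the card was added.
    boolean addIfAbsent(Card card) {
        int index = Collections.binarySearch(cards, card, BY_NAME);
        if (index >= 0) {
            return false; // found: duplicate name
        }
        cards.add(~index, card); // ~index is the insert position
        return true;
    }

    int size() { return cards.size(); }

    public static void main(String[] args) {
        SortedCardList list = new SortedCardList();
        System.out.println(list.addIfAbsent(new Card("TheOtherCard"))); // prints true
        System.out.println(list.addIfAbsent(new Card("TheCard")));      // prints true
        System.out.println(list.addIfAbsent(new Card("TheCard")));      // prints false
        System.out.println(list.size());                                // prints 2
    }
}
```

Inserting at the computed position keeps the list sorted without re-sorting after every add. The lookup is O(log n), but inserting into an ArrayList still shifts the elements after that position.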

Furthermore, Card should define equality correctly:

public class Card implements Comparable<Card> {

    ...

    @Override
    public int compareTo(Card other) {
        return name.compareTo(other.name);
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Card)) {
            return false;
        }
        Card other = (Card) obj;
        return name.equals(other.name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

In this case, instead of a Comparator you can implement Comparable<Card>, as the name is the real identification of a card. A Comparator is better suited to cases with several possible orderings, such as sorting persons on last name + first name, or first name + last name, or on city.
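With Comparable in place, the overloads of Collections.sort and Collections.binarySearch that take no Comparator can be used. The following is only a sketch: NaturalOrderDemo and containsSorted are hypothetical names, and equals/hashCode are omitted from this Card for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NaturalOrderDemo {
    // Minimal Card whose natural ordering is its name (assumption).
    static class Card implements Comparable<Card> {
        private final String name;
        Card(String name) { this.name = name; }
        @Override
        public int compareTo(Card other) { return name.compareTo(other.name); }
    }

    // Sorts a copy by natural order, then searches it without a Comparator.
    // Copying and sorting on every call is for demonstration only; real
    // code would keep the list sorted instead.
    static boolean containsSorted(List<Card> cards, Card key) {
        List<Card> sorted = new ArrayList<>(cards);
        Collections.sort(sorted);
        return Collections.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        List<Card> cards = Arrays.asList(new Card("TheOtherCard"), new Card("TheCard"));
        System.out.println(containsSorted(cards, new Card("TheCard"))); // prints true
        System.out.println(containsSorted(cards, new Card("Missing"))); // prints false
    }
}
```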

The hashCode allows using HashMap<Card, ...> .
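The same equals/hashCode pair also makes a HashSet<Card> usable, which may be a simpler way to reject duplicates than sorting and searching. This is a swapped-in alternative rather than the approach from the question; CardSetDemo and countUnique are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

class CardSetDemo {
    // Minimal Card with name-based equality (assumption).
    static class Card {
        private final String name;
        Card(String name) { this.name = name; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Card)) {
                return false;
            }
            return name.equals(((Card) obj).name);
        }

        @Override
        public int hashCode() { return name.hashCode(); }
    }

    // Counts distinct card names; Set.add ignores a card that equals
    // one already in the set.
    static int countUnique(String[] names) {
        Set<Card> seen = new HashSet<>();
        for (String name : names) {
            seen.add(new Card(name));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String[] names = {"TheCard", "TheOtherCard", "TheCard"};
        System.out.println(countUnique(names)); // prints 2
    }
}
```

A HashSet does not keep the cards in sorted order, so this fits only when ordering is not needed or can be applied afterwards.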
