Set subtraction in Python

In my Python code I have this class:

class _Point2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y    

    def __repr__(self):
        return 'point: (' + str(self.x) + ', ' + str(self.y) + ')' 

And there are two lists, initialPointsList and burnedPointsList:

initialPointsList = []
initialPointsList.append(_Point2D(1, 1))
initialPointsList.append(_Point2D(1, 2))
initialPointsList.append(_Point2D(1, 3))
initialPointsList.append(_Point2D(1, 4))
initialPointsList.append(_Point2D(1, 5))
initialPointsList.append(_Point2D(1, 6))
initialPointsList.append(_Point2D(1, 7))

burnedPointsList = []
burnedPointsList.append(_Point2D(1, 2))
burnedPointsList.append(_Point2D(1, 3))

I want to calculate the difference between initialPointsList and burnedPointsList.

I have executed:

result = set(initialPointsList) - set(burnedPointsList)
for item in result:
    print(item)

And get the following output:

point: (1, 1)
point: (1, 4)
point: (1, 5)
point: (1, 6)
point: (1, 2)
point: (1, 3)
point: (1, 7)

But I expected another result, without burned point coordinates:

point: (1, 1)
point: (1, 4)
point: (1, 5)
point: (1, 6)
point: (1, 7)

What is the best way to do that in Python? What is wrong with my code?

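By default, instances of a user-defined class compare and hash by identity, so two separate _Point2D(1, 2) objects count as different set elements. If you want this to work correctly, you need to define the __eq__() and __hash__() special methods. If you define __eq__(), it's usually also a good idea to define __ne__() (on Python 3, __ne__() defaults to the inverse of __eq__(), so this matters mainly on Python 2).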
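To see why the original code keeps the burned points, here is a small sketch of the default behaviour. PlainPoint is a hypothetical stand-in for a class that defines neither __eq__() nor __hash__():

```python
class PlainPoint(object):  # hypothetical name; no __eq__ or __hash__ defined
    def __init__(self, x, y):
        self.x = x
        self.y = y


a = PlainPoint(1, 2)
b = PlainPoint(1, 2)

# Default equality compares object identity, not the x and y values.
same_value_equal = (a == b)

# Both objects survive in a set because they are not considered equal.
distinct_in_set = len({a, b})

# Subtracting {b} from {a} removes nothing, since a != b.
leftover = set([a]) - set([b])

print(same_value_equal, distinct_in_set, len(leftover))
```

This is the same thing that happens in the question: the points in burnedPointsList are different objects from the ones in initialPointsList, so the set difference has nothing to remove.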

__eq__() should return True if its arguments are equivalent (their x and y values are the same). __ne__() should do the opposite. It's usually also desirable for __eq__() to do type checking, and return False if the "other" value is not of the same type as self.

__hash__() should return an integer. The number must be the same for two values which compare equal with __eq__(), and it's desirable but not strictly required for it to be different for distinct values. A good implementation is this:

def __hash__(self):
    return hash((self.x, self.y))

The tuple hashing algorithm will combine the hash values of its elements in a statistically well-behaved way. You may sometimes see people recommend bitwise XOR (i.e. self.x ^ self.y) here, but that isn't a good idea. That technique throws away all the bits the two values have in common, which makes for inferior hashing performance (e.g. it always returns zero if self.x == self.y).
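A rough way to compare the two approaches is to count how many distinct hash values each one produces for a small grid of coordinates. This is only an illustrative sketch; the 10 x 10 grid and the variable names are my own choices:

```python
# 100 distinct coordinate pairs with x and y in 0..9.
points = [(x, y) for x in range(10) for y in range(10)]

# XOR of two values below 16 is itself below 16, so many pairs must collide.
xor_hashes = {x ^ y for x, y in points}

# Hashing the tuple mixes both coordinates together.
tuple_hashes = {hash((x, y)) for x, y in points}

# Every point on the diagonal (x == y) collapses to the same XOR value.
diagonal_xor = {x ^ x for x in range(10)}

print(len(xor_hashes), len(tuple_hashes), diagonal_xor)
```

The XOR version cannot produce more than 16 distinct values for this grid, whereas the tuple hash spreads the same points over many more values, which means fewer collisions inside a set or dict.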

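Finally, you need to make sure that hash values don't change after an object has been constructed. If a point is mutated while it sits in a set, its hash no longer matches the slot it was stored in, and lookups for it can fail. This is most easily accomplished by converting self.x and self.y into read-only properties.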
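One possible way to do that is sketched below. The underscore-prefixed attribute names _x and _y are my assumption, and the class is trimmed to the parts relevant to immutability:

```python
class _Point2D(object):
    def __init__(self, x, y):
        # Keep the values in underscore-prefixed attributes. This is a
        # naming convention only; Python does not enforce privacy.
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    def __hash__(self):
        return hash((self.x, self.y))


p = _Point2D(1, 2)
hash_before = hash(p)

# A property without a setter rejects assignment with AttributeError.
try:
    p.x = 10
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True

print(assignment_blocked, hash(p) == hash_before)
```

Code that really wants to can still assign to p._x directly, so this guards against accidental mutation rather than making the object strictly immutable.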

For completeness, here are the __eq__, __ne__, and __hash__ methods mentioned in Kevin's answer.

def __eq__(self, other):
    return type(self) is type(other) and self.x == other.x and self.y == other.y

def __ne__(self, other):
    return not self.__eq__(other)

def __hash__(self):
    return hash((self.x, self.y))

I tested it by adding these methods to your class, and it produces the expected output (sets are unordered, so the order of the lines may vary):

point: (1, 5)
point: (1, 6)
point: (1, 1)
point: (1, 4)
point: (1, 7)
