简体   繁体   中英

How to use any value as a dictionary key?

I'd like to use instances of any type as a key in a single dict .

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ id(my_object) ] = arbitrary_val

d = {}
add_to_dict('my_str', arbitrary_val)
add_to_dict(my_list, arbitrary_val)
add_to_dict(my_int, arbirtray_val)

my_object = myclass()
my_object.__hash__ = None
add_to_dict(my_object, arbitrary_val)

The above won't work because my_list and my_object can't be hashed.

My first thought was to just pass in the id value of the object using the id() function.

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ id(my_object) ] = arbitrary_val

However, that won't work because id('some string') == id('some string') is not guaranteed to always be True .

My second thought was to test if the object has the __hash__ attribute. If it does, use the object, otherwise, use the id() value.

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ my_object if my_object.__hash__ else id(my_object) ] = arbitrary_val

However, since hash() and id() both return int 's, I believe I will eventually get a collision.

How can I write add_to_dict(obj, d) above to ensure that no matter what obj is ( list , int , str , object , dict ), it will correctly set the item in the dictionary and do so without collision?

We could make some kind of dictionary that allows us to insert mutable objects as well:

class DictionaryMutable:

    nullobject = object()

    def __init__(self):
        self._inner_dic = {}
        self._inner_list = []

    def __getitem__(self, name):
        try:
            return self._inner_dic[name]
        except TypeError:
            for key, val in self._inner_list:
                if name == key:
                    return val
            raise KeyError(name)

    def __setitem__(self, name, value):
        try:
            self._inner_dic[name] = value
        except TypeError:
            for elm in self._inner_list:
                if name == elm[0]:
                    elm[1] = value
                    break
            else:
                self._inner_list.append([name,value])

    # ...

This works as follows: the DictionaryMutable consists out of a dictionary and a list. The dictionary contains the hashable immutable keys, the list contains sublists where each sublist contains two elements: a key and a value.

For each lookup we first attempt to perform a lookup on the dictionary, in case the key name is unhashable, a TypeError will be thrown. In that case we iterate through the list, check if one of the keys matches and return the corresponding value if it does. If no such element exists, we raise a KeyError .

Setting elements works approximately the same way: first we attempt to set the element in the dictionary. If it turns out the key is unhashable, we search linearly through the list and aim to add the element. If that fails, we add it at the end of the list.

This implementation has some major disadvantages:

  • if the dictionary lookup fails due to the key being unhashable, we will perform linear lookup , this can siginificantly slow down the lookup; and
  • if you alter an object that is in the dictionary, then the key will be updated, and thus a search for that object will fail. It thus can result in some unpredicted behavior.

This is only a basic implementation. For instance __iter__ , etc. need to be implemented as well.

Instead of the id() of the object, you could use the pickled byte stream representation of the object pickle.dumps() returns for it. pickle works with most built-in types, and there are ways to extend it to work with most values it doesn't know how to do automatically.

Note: I used the repr() of the object as its "arbitrary value" in an effort to make it easier to identify them in the output displayed.

try:
    import cpickle as pickle
except ModuleNotFoundError:
    import pickle
from pprint import pprint

def add_to_dict(d, obj, arbitrary_val='123'):
    d[pickle.dumps(obj)] = arbitrary_val

class MyClass: pass

my_string = 'spam'
my_list = [13, 'a']
my_int = 42
my_instance = MyClass()

d = {}
add_to_dict(d, my_string, repr(my_string))
add_to_dict(d, my_list, repr(my_list))
add_to_dict(d, my_int, repr(my_int))
add_to_dict(d, my_instance, repr(my_instance))

pprint(d)

Output:

{b'\x80\x03K*.': '42',
 b'\x80\x03X\x04\x00\x00\x00spamq\x00.': "'spam'",
 b'\x80\x03]q\x00(K\rX\x01\x00\x00\x00aq\x01e.': "[13, 'a']",
 b'\x80\x03c__main__\nMyClass\nq\x00)\x81q\x01.': '<__main__.MyClass object at '
                                                  '0x021C1630>'}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM