Let's say I have a class called Symbol. At any given time, I want one and only one copy of a Symbol with a given id. For example:
registry = {}

class Symbol(object):
    def __init__(self, id):
        self.id = id

    def __eq__(self, other):
        return self is other

def symbol(id):
    if id not in registry:
        registry[id] = Symbol(id)
    return registry[id]
I'd like to be able to pickle my Symbol objects, but I can't figure out how to get cPickle to call my symbol factory. I could implement the __getstate__/__setstate__ overrides, but that would still not merge unpickled objects with the ones already existing in the registry. How can I pickle the above class while preserving the 1:1 ratio of Symbols to IDs?
Edit (updated title to state "interned" instead of "singleton"):
Let me explain the use case. We're using these Symbols as keys in dicts. Having them interned drastically improves performance.
What I need to have happen:
x = symbol("x")
y = pickle.loads(pickle.dumps(x))
assert x is y
Since you don't want more than one object with a given id, provide a custom __new__ method in place of your symbol function.
class Symbol(object):
    registry = {}

    def __new__(cls, *args, **kwargs):
        id_ = args[0]
        # object.__new__ takes no extra arguments in Python 3, so pass only cls
        return Symbol.registry.setdefault(id_, object.__new__(cls))

    def __init__(self, id):
        self.id = id
Now you don't need a factory function to create Symbol objects.
>>> a = Symbol('=')
>>> b = Symbol('=')
>>> a is b
True
You may want to use weakref's WeakValueDictionary for the registry of symbols, so that garbage collection can reclaim the memory when a Symbol is no longer referenced.
You could use the following class to define what an interned object is. Your Symbol class (or any other class) can then inherit from it.
import weakref

class Interned(object):
    # you need to create this registry in each class if the keys are not unique system-wide
    registry = weakref.WeakValueDictionary()

    def __new__(cls, *args, **kwargs):
        assert 0 < len(args)
        if args[0] not in cls.registry:  # don't use setdefault, to avoid creating unnecessary objects
            # o is a strong reference needed to avoid garbage collection within this call;
            # object.__new__ takes no extra arguments in Python 3
            o = object.__new__(cls)
            cls.registry[args[0]] = o
            return o
        return cls.registry[args[0]]

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        # needed by Python 3, where defining __eq__ removes the inherited __hash__
        return id(self)

    def __ne__(self, other):
        return self is not other
Your code becomes:
class Symbol(Interned):
    def __init__(self, id):
        self.id = id
Resulting in:
>>> a = Symbol('=')
>>> b = Symbol('=')
>>> a is b
True
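The classes above intern on construction but do not say how pickle should rebuild an instance. One way to connect the two, as a minimal sketch rather than a definitive implementation, is to define __reduce__ so that pickle stores "call Symbol(id)" instead of the instance state. Unpickling then goes back through __new__ and picks up the object already in the registry. The _registry name and the choice to set id inside __new__ are assumptions made for this example.

```python
import pickle
import weakref


class Symbol(object):
    # Hypothetical registry name; weak values let unused symbols be collected.
    _registry = weakref.WeakValueDictionary()

    def __new__(cls, id):
        obj = cls._registry.get(id)
        if obj is None:
            # First time this id is seen: create and register the instance.
            obj = object.__new__(cls)
            obj.id = id
            cls._registry[id] = obj
        return obj

    def __reduce__(self):
        # Ask pickle to rebuild the object by calling Symbol(self.id),
        # which routes through __new__ and therefore through the registry.
        return (Symbol, (self.id,))


x = Symbol("x")
y = pickle.loads(pickle.dumps(x))
assert x is y
```

Note that this only guarantees identity within one process while the original object is still alive; in a fresh process the first load simply creates and registers the symbol.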
You can try to subclass pickle.Unpickler and implement your loading logic in the load method.
But you will need some kind of key to know whether the object already exists at runtime (so that you can return a reference rather than a new instance). This will lead you to reimplement the Python object space.
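For what it's worth, the pickle module already has a hook for this kind of key: persistent_id on a Pickler subclass and persistent_load on an Unpickler subclass. This is a different technique from overriding load, and the sketch below is only one possible arrangement. It reuses the registry and symbol factory from the question; the SymbolPickler, SymbolUnpickler, dumps and loads names and the ("Symbol", id) tuple format are assumptions made for illustration.

```python
import io
import pickle

registry = {}


class Symbol(object):
    def __init__(self, id):
        self.id = id


def symbol(id):
    # Factory from the question: one Symbol per id.
    if id not in registry:
        registry[id] = Symbol(id)
    return registry[id]


class SymbolPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store only a key for Symbols; returning None pickles obj normally.
        if isinstance(obj, Symbol):
            return ("Symbol", obj.id)
        return None


class SymbolUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the stored key through the factory, so the registry is reused.
        tag, key = pid
        if tag == "Symbol":
            return symbol(key)
        raise pickle.UnpicklingError("unsupported persistent id: %r" % (pid,))


def dumps(obj):
    buf = io.BytesIO()
    SymbolPickler(buf).dump(obj)
    return buf.getvalue()


def loads(data):
    return SymbolUnpickler(io.BytesIO(data)).load()


x = symbol("x")
assert loads(dumps(x)) is x
```

The trade-off is that every caller has to go through the custom dumps/loads helpers instead of plain pickle.dumps and pickle.loads.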
I would recommend trying to find another data structure better suited to your actual problem.