I'm building a class that holds, among other things, a dictionary with integer keys and list values. Adding values to this dictionary seems to be a real bottleneck, though, and I was wondering whether there is some way to speed up my code.
from collections import defaultdict

class myClass():
    def __init__(self):
        # Missing keys are created on first access with an empty list.
        self.d = defaultdict(list)

    def addValue(self, index, value):
        self.d[index].append(value)
Is this really the optimal way of doing this? I don't really care about the order of the values, so perhaps there is a more suitable data structure out there with a faster append. Then again, 'append' doesn't seem to be the main problem, because if I simply append to an empty list, the code is a lot faster. I guess it's the loading of the previously stored list that takes up most of the time?
I found out that the problem is not in the dict but in the list append (although I claimed otherwise in my original post, for which I apologize). It is caused by a known issue with Python's garbage collector, which is well explained in this other question. Disabling the gc before adding all the values and re-enabling it afterwards speeds up the process immensely!
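For anyone who wants to try the same workaround, here is a minimal sketch of what I mean by switching the gc off around the bulk insert. The function name `bulk_add` and the `pairs` argument are just made up for illustration; only `gc.disable()`, `gc.enable()` and `gc.isenabled()` are real standard-library calls. Whether it helps will depend on your data and Python version.

```python
import gc
from collections import defaultdict

def bulk_add(pairs):
    """Group (index, value) pairs into a dict of lists with the
    cyclic garbage collector switched off during the inserts.

    `bulk_add` and `pairs` are hypothetical names for this sketch.
    """
    d = defaultdict(list)
    was_enabled = gc.isenabled()  # remember the caller's gc state
    gc.disable()
    try:
        for index, value in pairs:
            d[index].append(value)
    finally:
        # Restore the collector only if it was on to begin with.
        if was_enabled:
            gc.enable()
    return d
```

The `try`/`finally` is there so the collector is switched back on even if one of the inserts raises.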
Compare it to this:
class myClass():
    def __init__(self):
        self.d = {}

    def addValue(self, index, value):
        # setdefault returns the existing list, or stores and returns a new one.
        self.d.setdefault(index, []).append(value)
They say "Better to ask for forgiveness than for permission." Now, you're not asking for permission yourself, but I thought maybe defaultdict does, and that's what is slowing it down. Try this:
class myClass():
    def __init__(self):
        self.d = {}

    def addValue(self, index, value):
        try:
            self.d[index].append(value)
        except KeyError:
            # First value for this key: start a new list.
            self.d[index] = [value]
This tries to access the index key in the dictionary; if the key doesn't exist, a KeyError is raised and handled by creating the list.
Is it any faster?
In conclusion, I can say that the code in the original question is as fast as or faster than all the other suggestions.
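If you want to check this on your own machine, something along these lines should do. It is only a rough sketch: the three functions mirror the variants from this thread, but the function names, the `PAIRS` workload (100,000 values over 1,000 keys) and the `benchmark` helper are assumptions I made up for the comparison, and the timings will vary with the Python version and the key distribution.

```python
import timeit
from collections import defaultdict

def with_defaultdict(pairs):
    d = defaultdict(list)
    for index, value in pairs:
        d[index].append(value)
    return d

def with_setdefault(pairs):
    d = {}
    for index, value in pairs:
        d.setdefault(index, []).append(value)
    return d

def with_try_except(pairs):
    d = {}
    for index, value in pairs:
        try:
            d[index].append(value)
        except KeyError:
            d[index] = [value]
    return d

# Hypothetical workload: 100,000 values spread over 1,000 integer keys.
PAIRS = [(i % 1000, i) for i in range(100_000)]

def benchmark(repeat=5):
    """Return the best wall-clock time in seconds for each variant."""
    results = {}
    for func in (with_defaultdict, with_setdefault, with_try_except):
        timings = timeit.repeat(lambda: func(PAIRS), number=1, repeat=repeat)
        results[func.__name__] = min(timings)
    return results

if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Taking the minimum over several repeats filters out some of the noise from other processes; with far fewer values per key the ranking may well come out differently.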