I'm trying to unpickle chunks of a webpage stored in the Google App Engine memcache. First I fetch the chunks as a dictionary keyed by chunk name, then stitch them back together:
def get_by_key_name(key_name):
    result = memcache.get_multi(['%s.%s' % (key_name, i) for i in xrange(32)])
    serialized = ''
    for k, v in sorted(result.items()):
        if v is not None:
            serialized = serialized.join(v)
        else:
            return None
    return pickle.loads(serialized) #Line that fails
For some reason it raises EOFError. The code that originally pickled the data is:
serialized = pickle.dumps(content, 2)
values = {}
for i in xrange(0, len(serialized), CHUNKSIZE):
    values['%s.%s' % (key_name, i//CHUNKSIZE)] = serialized[i:i+CHUNKSIZE]
Anybody have any idea why? By the way, CHUNKSIZE is 950000 bytes. I tried to load reddit's front page onto the memcache, so I don't think it is exceeding this limit.
You want to concatenate the strings, not join them:
serialized += v
str.join inserts a copy of the original string between each pair of characters of its argument:
>>> 'hello'.join('there')
'thellohhelloehellorhelloe'
I'm kinda impressed you didn't run out of memory!
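To check this against the pickle case, here is a minimal sketch (Python 3 syntax; the payload and the tiny chunk size are made up purely for illustration) that splits a pickled value into chunks and rebuilds it with plain concatenation:

```python
import pickle

# Made-up payload and a tiny chunk size, just to force several chunks.
content = {'title': 'front page', 'body': 'x' * 200}
chunk_size = 16

serialized = pickle.dumps(content, 2)
chunks = [serialized[i:i + chunk_size]
          for i in range(0, len(serialized), chunk_size)]

# Plain concatenation puts the bytes back exactly as they were pickled.
rebuilt = b''
for chunk in chunks:
    rebuilt += chunk

assert rebuilt == serialized
assert pickle.loads(rebuilt) == content
```

The chunks only round-trip if the rebuilt string is byte-for-byte identical to what pickle.dumps produced, which is what += gives you here.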
You are joining your string incorrectly:
serialized = ''
for k, v in sorted(result.items()):
    if v is not None:
        serialized = serialized.join(v)
This uses serialized as built so far as the joining string, with the new string treated as individual characters:
>>> serialized = ''
>>> for v in ('foo', 'bar', 'baz'):
...     serialized = serialized.join(v)
...
>>> serialized
'bbfooafoorabfooafoorz'
where 'foo'.join('bar') produced 'bfooafoor', which was then used to join the characters of 'baz'.
Build a list of the chunks, then join that:
if None in result.viewvalues():
    # one or more keys came back empty, abort
    return
serialized = ''.join([v for k, v in sorted(result.items())])
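For a fuller picture, here is a rough end-to-end sketch of the split and rebuild steps. It is written in Python 3 syntax, a plain dict stands in for memcache, and the helper names split_into_chunks and join_chunks are hypothetical, not part of any App Engine API. One thing it does differently: it rebuilds the keys in numeric order instead of sorting them, because the keys are strings and 'name.10' sorts before 'name.2' once there are more than ten chunks.

```python
import pickle

CHUNKSIZE = 950000  # value quoted in the question
MAX_CHUNKS = 32     # matches xrange(32) in the question


def split_into_chunks(key_name, content, chunksize=CHUNKSIZE):
    """Pickle content and return {chunk_key: chunk_bytes}."""
    serialized = pickle.dumps(content, 2)
    return {
        '%s.%s' % (key_name, i // chunksize): serialized[i:i + chunksize]
        for i in range(0, len(serialized), chunksize)
    }


def join_chunks(key_name, result):
    """Rebuild the value from a dict of chunks (stand-in for a get_multi result)."""
    parts = []
    for i in range(MAX_CHUNKS):
        chunk = result.get('%s.%s' % (key_name, i))
        if chunk is None:
            break  # first missing index: assume we are past the last chunk
        parts.append(chunk)
    # Nothing stored, or a gap in the middle of the sequence: give up.
    if not parts or len(parts) != len(result):
        return None
    return pickle.loads(b''.join(parts))


# Small chunk size so the example produces more than ten chunks.
stored = split_into_chunks('page', list(range(1000)), chunksize=200)
restored = join_chunks('page', stored)
```

Note that this cannot tell when only the last chunk has gone missing; storing the chunk count under a separate key would be one way to catch that, but that goes beyond what the question does.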