
Writing a pickle.dumps output to a file

I have the following code:

import pickle

some_dict = {'a': 0, 'b': 1}
line = "some_dict_b = %s\n" % pickle.dumps(some_dict, 2)
exec(line)
decoded_dict = pickle.loads(some_dict_b)
decoded_dict == some_dict

In Python 3 the last line evaluates to True. In Python 2 (2.7.8) I get an error on the exec line. I know dumps returns a str in 2.7, while it returns bytes in 3.

I am writing a program that parses data from an input file, creates certain in-memory objects, and writes out a Python script that uses these objects. I write the objects into the script file by calling pickle.dumps() and inserting the result into a variable declaration line, as sketched above. But I need to be able to run this code in Python 2.

I did notice that in Python 3 the line variable has each backslash properly escaped and the value carries a b'...' prefix and quotes:

>>> line
"some_dict_b = b'\\x80\\x02...

while in python 2 I get:

>>> line
'some_dict_b = \x80\x02...

The Python 3 bytes type doesn't have a separate string form, so when it is converted to a string with %s, the object representation is used instead. In Python 2, %s inserts the raw characters of the str, with no quotes or escapes, which is why the generated line is not valid source. To produce valid Python syntax from an object on both versions, use the %r formatter instead, which inserts the representation directly.
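Applied to the line from the question, the change is a single character. The following is a minimal sketch rather than a definitive fix; the separate namespace dict is an illustrative choice that is not in the original code.

```python
import pickle

some_dict = {'a': 0, 'b': 1}

# %r formats with repr(), producing a quoted, escaped literal on both versions.
line = "some_dict_b = %r\n" % (pickle.dumps(some_dict, 2),)

# A separate namespace dict (illustrative) keeps exec from touching the current scope.
namespace = {}
exec(line, namespace)

decoded_dict = pickle.loads(namespace["some_dict_b"])
print(decoded_dict == some_dict)  # True
```

Wrapping the value in a one-element tuple avoids surprises if the right-hand operand of % ever happens to be a tuple itself.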

In Python 2:

>>> import pickle
>>> some_dict = {'a':0, 'b':1}
>>> p = pickle.dumps(some_dict, 2)
>>> print 'string: %s\nrepresentation: %r' % (p, p)
string: ?}q(UaqKUbqKu.
representation: '\x80\x02}q\x00(U\x01aq\x01K\x00U\x01bq\x02K\x01u.'

In Python 3:

>>> import pickle
>>> some_dict = {'a':0, 'b':1}
>>> p = pickle.dumps(some_dict, 2)
>>> print('string: %s\nrepresentation: %r' % (p, p))
string: b'\x80\x02}q\x00(X\x01\x00\x00\x00bq\x01K\x01X\x01\x00\x00\x00aq\x02K\x00u.'
representation: b'\x80\x02}q\x00(X\x01\x00\x00\x00bq\x01K\x01X\x01\x00\x00\x00aq\x02K\x00u.'

Object representations (the output of the repr() function, which uses the object.__repr__ special method) generally attempt to give you text that can be pasted back into a Python script or interactive prompt to recreate the same value.

From the documentation for repr():

For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(), otherwise the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information often including the name and address of the object.

None of this is specific to pickle, really.
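A small sketch of that round-trip property for a few built-in types. The helper name round_trips is hypothetical, and ast.literal_eval is used here in place of eval() because it accepts literals only.

```python
import ast

def round_trips(value):
    """Return True if repr(value) parses back to an equal value."""
    try:
        return ast.literal_eval(repr(value)) == value
    except (ValueError, SyntaxError):
        # Angle-bracket representations are not valid Python literals.
        return False

samples = [b'\x80\x02}q\x00', 'it\'s "quoted"', 3.5, [1, 2], {'a': (0, None)}]
print(all(round_trips(v) for v in samples))  # True
print(round_trips(object()))                 # False
```

The last call shows the other branch described in the documentation: a plain object() has an angle-bracket representation, so it cannot be recreated from its repr.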

Whenever you think "I'll use exec", think again. You almost never need it. Instead of evaluating generated source like this, keep the pickled data in a variable.

Then assign the unpickled result explicitly to the variable you want.

import pickle

some_dict = {'a': 0, 'b': 1}
line = pickle.dumps(some_dict)
decoded_dict = pickle.loads(line)
decoded_dict == some_dict
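If the data does not have to be embedded in generated source at all, one way to follow this advice is to keep the pickle in its own binary file and load it from there. This is only a sketch; the temporary directory and the file name some_dict.pkl are assumptions for illustration.

```python
import os
import pickle
import tempfile

some_dict = {'a': 0, 'b': 1}

# Illustrative location; a real program would choose its own path.
path = os.path.join(tempfile.mkdtemp(), 'some_dict.pkl')

# Binary mode matters: pickle data is bytes, not text.
with open(path, 'wb') as handle:
    pickle.dump(some_dict, handle, 2)

with open(path, 'rb') as handle:
    decoded_dict = pickle.load(handle)

print(decoded_dict == some_dict)  # True
```

Protocol 2 is kept from the question so that the file can be read by both Python 2 and Python 3.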

You can call repr on the str or bytes object before inserting it into the line. In the examples here, d is a dict such as {'a': 12, 'b': 24}.

# Python 2
>>> 'some_dict = %s' % repr(pickle.dumps(d))
'some_dict = "(dp0\\nS\'a\'\\np1\\nI12\\nsS\'b\'\\np2\\nI24\\ns."'

# Python 3
>>> 'some_dict = %s' % repr(pickle.dumps(d))
"some_dict = b'\\x80\\x03}q\\x00(X\\x01\\x00\\x00\\x00bq\\x01K\\x18X\\x01\\x00\\x00\\x00aq\\x02K\\x0cu.'"

Or use the format method, with !r to call repr automatically:

>>> 'some_dict = {!r}'.format(pickle.dumps(d))
"some_dict = b'\\x80\\x03}q\\x00(X\\x01\\x00\\x00\\x00bq\\x01K\\x18X\\x01\\x00\\x00\\x00aq\\x02K\\x0cu.'"

(This also works in Python 2.)
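Putting this together for the use case in the question, here is a hedged sketch of writing a generated script to disk and running it. The temporary file and the layout of the generated script are assumptions; runpy.run_path is used only to check the result.

```python
import os
import pickle
import runpy
import tempfile

some_dict = {'a': 0, 'b': 1}

# Build the generated script; %r emits a quoted, escaped literal.
source = "import pickle\nsome_dict = pickle.loads(%r)\n" % (pickle.dumps(some_dict, 2),)

# Write it to a temporary file (the path is illustrative only).
fd, path = tempfile.mkstemp(suffix='.py')
try:
    with os.fdopen(fd, 'w') as handle:
        handle.write(source)
    # run_path executes the script and returns its global namespace.
    result = runpy.run_path(path)
finally:
    os.remove(path)

print(result['some_dict'] == some_dict)  # True
```

Because repr escapes every non-printable byte, the generated source is plain ASCII and can be written in text mode.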
