The difference between `str` and `pickle` serialization methods for dictionary values

Question

If I want to save a dictionary to a file and read it back from the file later, I know two methods, but I do not know the differences between them. Could anyone explain? Here is a simple example. Suppose this is my dictionary:

D = {'zzz':123,
     'lzh':321,
     'cyl':333}

The first method to save it to the file:

with open('tDF.txt','w') as f: # save
   f.write(str(D) + '\n')
with open('tDF.txt','r') as f:
   Data = f.read() # read. Data is a string
Data = eval(Data) # convert the string back to a dictionary

The second method (using pickle):

import pickle
with open('tDF.txt','wb') as f: # save; pickle data is binary, so open in binary mode
   pickle.dump(D,f)
with open('tDF.txt','rb') as f:
   D = pickle.load(f) # D is a dictionary again

I think the first method is much simpler. What are the differences?

Thanks!

Answer 1

Writing `str` value representation

If you write the `str` value of your data, you rely on that text happening to be well formed enough to be read back.

In some cases (e.g. floating-point numbers under Python 2, where `str` rounds to 12 significant digits, but also more complex objects) you would lose some precision or information.

Using `repr` instead of `str` might improve the situation a bit, as `repr` is supposed to produce text that is likely to work when read back (but without any guarantee).
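To illustrate the "without any guarantee" part, here is a minimal sketch. It reads the text back with `ast.literal_eval` from the standard library instead of the `eval` used in the question; that swap is my assumption, chosen because `literal_eval` only accepts plain literals. The dictionary `E` is made up for the example.

```python
import ast
import datetime

D = {'zzz': 123, 'lzh': 321, 'cyl': 333}

# Plain literals (str, int, float, list, dict, ...) survive a repr() round trip.
text = repr(D)
restored = ast.literal_eval(text)  # parses literals only, unlike eval()
round_trip_ok = (restored == D)

# A value that is not a plain literal does not survive: its repr is
# "datetime.date(2014, 7, 18)", which is a function call, not a literal.
E = {'when': datetime.date(2014, 7, 18)}
try:
    ast.literal_eval(repr(E))
    date_round_trip_ok = True
except (ValueError, SyntaxError):
    date_round_trip_ok = False

print(round_trip_ok, date_round_trip_ok)
```

So the text approach is fine for dictionaries of plain literals, and breaks down as soon as a value's `repr` is not itself a literal.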

Writing `pickle`d data

Pickle takes care of every bit, so the serialized data holds precise information.

This is quite a significant difference.
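As a rough sketch of what "precise" means here, the following round trip goes through `pickle.dumps` and `pickle.loads` in memory rather than through a file. The sample dictionary is invented for illustration.

```python
import pickle

original = {
    'ratio': 0.1 + 0.2,   # a float with many significant digits
    'point': (1, 2),      # a tuple
    'tags': {'a', 'b'},   # a set
}

blob = pickle.dumps(original)   # bytes, not human-readable text
restored = pickle.loads(blob)

# Values and their types come back unchanged.
print(restored == original)
print(type(restored['point']).__name__, type(restored['tags']).__name__)
```

Note that `pickle.loads` should only be used on data you trust, since unpickling can execute arbitrary code.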

Using other serialization methods

Personally, I prefer serializing into JSON or sometimes YAML, as data in these formats is readable, portable, and can even be edited by hand.

Serialize to JSON

For json it works this way:

import json
data = {"a": "aha", "b": "bebe", "age": 123, "num": 3.1415}
with open("data.json", "w") as f:
    json.dump(data, f)


with open("data.json", "r") as f:
    readdata = json.load(f)  # json.load takes only the file object

print(readdata)
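One caveat worth keeping in mind: JSON has fewer types than Python, so a round trip is not always exact. The sketch below uses made-up sample data and works in memory with `json.dumps` and `json.loads`.

```python
import json

data = {"a": "aha", "age": 123, "num": 3.1415, "pair": (1, 2), 7: "seven"}

readdata = json.loads(json.dumps(data))

# Strings and numbers come back as they were ...
print(readdata["a"], readdata["age"], readdata["num"])
# ... but the tuple is now a list, and the integer key is now a string.
print(readdata["pair"], "7" in readdata, 7 in readdata)
```

For a dictionary with string keys and simple values, like the one in the question, none of this matters.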

Serialize to YAML

With YAML:

First, be sure you have a YAML library installed, e.g.:

$ pip install pyyaml

Personally, I have it installed all the time, as I use it very often.

Then the script changes only a bit:

import yaml
data = {"a": "aha", "b": "bebe", "age": 123, "num": 3.1415}
with open("data.yaml", "w") as f:
    yaml.dump(data, f)


with open("data.yaml", "r") as f:
    readdata = yaml.safe_load(f)  # safe_load builds only plain Python types

print(readdata)

Conclusions

For rather simple data types, the methods described above work easily.

Once you start using instances of classes you have defined, you need to define proper loaders and serializers for the given format. Describing this is out of scope for this question, but it is definitely possible in all cases where some solution exists (there are types of values that cannot be serialized reliably, such as file pointers, database connections, etc.).
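Just to give a flavour of what such a loader and serializer can look like for JSON, here is a minimal sketch. The `Point` class, the `encode_point` and `decode_point` helpers, and the `"__point__"` marker key are all hypothetical names invented for this example; only the `default` and `object_hook` parameters come from the `json` module itself.

```python
import json

class Point(object):
    """A hypothetical user-defined class."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def encode_point(obj):
    # Called by json.dumps for objects it cannot serialize on its own.
    if isinstance(obj, Point):
        return {"__point__": True, "x": obj.x, "y": obj.y}
    raise TypeError("Cannot serialize %r" % (obj,))

def decode_point(d):
    # Called by json.loads for every decoded JSON object (dict).
    if d.get("__point__"):
        return Point(d["x"], d["y"])
    return d

text = json.dumps({"origin": Point(0, 0), "label": "start"}, default=encode_point)
restored = json.loads(text, object_hook=decode_point)

print(type(restored["origin"]).__name__, restored["origin"].x, restored["label"])
```

Pickle handles simple classes like this one without extra code, at the price of a format that is neither readable nor portable outside Python.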
