Storing Python dictionaries
I'm used to moving data in and out of Python using CSV files, but there are obvious challenges with this. Are there simple ways to store a dictionary (or sets of dictionaries) in a JSON or pickle file?
For example:
data = {}
data['key1'] = "keyinfo"
data['key2'] = "keyinfo2"
I would like to know both how to save this, and then how to load it back in.
try:
    import cPickle as pickle
except ImportError:  # Python 3.x
    import pickle
with open('data.p', 'wb') as fp:
    pickle.dump(data, fp, protocol=pickle.HIGHEST_PROTOCOL)
See the pickle module documentation for additional information regarding the protocol argument.
with open('data.p', 'rb') as fp:
    data = pickle.load(fp)
import json
with open('data.json', 'w') as fp:
    json.dump(data, fp)
Supply extra arguments, like sort_keys or indent, to get a pretty result. The argument sort_keys sorts the keys alphabetically, and indent=N indents your data structure with N spaces.
json.dump(data, fp, sort_keys=True, indent=4)
with open('data.json', 'r') as fp:
    data = json.load(fp)
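One difference between the two formats is worth seeing before choosing: JSON object keys are always strings and JSON has no tuple type, so a dictionary does not necessarily come back identical, whereas pickle restores the original Python types. A minimal sketch, where the sample dictionary and the temporary file paths are made up for illustration:

```python
import json
import os
import pickle
import tempfile

# Hypothetical sample data: integer keys and a tuple value.
original = {1: "one", 2: (10, 20)}

tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "data.json")
pickle_path = os.path.join(tmp_dir, "data.p")

# JSON round trip: keys become strings, the tuple becomes a list.
with open(json_path, "w") as fp:
    json.dump(original, fp)
with open(json_path) as fp:
    from_json = json.load(fp)

# Pickle round trip: the dictionary comes back unchanged.
with open(pickle_path, "wb") as fp:
    pickle.dump(original, fp, protocol=pickle.HIGHEST_PROTOCOL)
with open(pickle_path, "rb") as fp:
    from_pickle = pickle.load(fp)

print(from_json)
print(from_pickle)
```

If the data only holds strings, numbers, lists and string-keyed dictionaries, both formats give the same result back.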
Minimal example, writing directly to a file:
import json
json.dump(data, open(filename, 'w'))  # text mode; json.dump writes str, so 'wb' fails on Python 3
data = json.load(open(filename))
or safely opening / closing:
import json
with open(filename, 'w') as outfile:
    json.dump(data, outfile)
with open(filename) as infile:
    data = json.load(infile)
If you want to save it in a string instead of a file:
import json
json_str = json.dumps(data)
data = json.loads(json_str)
To write to a file:
import json
myfile.write(json.dumps(mydict))
To read from a file:
import json
mydict = json.loads(myfile.read())
myfile is the file object for the file that you stored the dict in.
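The snippets above leave open how myfile is obtained. One possible way, sketched here with a made-up dictionary and a temporary path, is to open the file in text mode with a with block so it is closed automatically:

```python
import json
import os
import tempfile

# Hypothetical data and location, used only for this sketch.
mydict = {"key1": "keyinfo", "key2": "keyinfo2"}
path = os.path.join(tempfile.mkdtemp(), "mydict.json")

# Write: myfile is a text-mode file object opened for writing.
with open(path, "w") as myfile:
    myfile.write(json.dumps(mydict))

# Read: reopen the same file for reading and parse its contents.
with open(path) as myfile:
    restored = json.loads(myfile.read())

print(restored)
```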
If you want an alternative to pickle or json, you can use klepto.
>>> init = {'y': 2, 'x': 1, 'z': 3}
>>> import klepto
>>> cache = klepto.archives.file_archive('memo', init, serialized=False)
>>> cache
{'y': 2, 'x': 1, 'z': 3}
>>>
>>> # dump dictionary to the file 'memo.py'
>>> cache.dump()
>>>
>>> # import from 'memo.py'
>>> from memo import memo
>>> print(memo)
{'y': 2, 'x': 1, 'z': 3}
With klepto, if you had used serialized=True, the dictionary would have been written to memo.pkl as a pickled dictionary instead of as clear text.
You can get klepto here: https://github.com/uqfoundation/klepto
dill is probably a better choice for pickling than pickle itself, as dill can serialize almost anything in Python. klepto can also use dill.
You can get dill here: https://github.com/uqfoundation/dill
The additional mumbo-jumbo on the first few lines is there because klepto can be configured to store dictionaries to a file, to a directory context, or to a SQL database. The API is the same whatever you choose as the backend archive. It gives you an "archivable" dictionary on which you can use load and dump to interact with the archive.
If you're after serialization, but won't need the data in other programs, I strongly recommend the shelve module. Think of it as a persistent dictionary.
import shelve
myData = shelve.open('/path/to/file')
# Check for values.
keyVar in myData
# Set values (shelve keys must be strings)
myData[anotherKey] = someValue
# Save the data for future use.
myData.close()
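To make the "persistent dictionary" idea concrete, here is a small self-contained sketch. The key name "settings" and the temporary path are assumptions for illustration, and the with form relies on Python 3.4 or later:

```python
import os
import shelve
import tempfile

# Hypothetical location; shelve may append its own file extension(s).
path = os.path.join(tempfile.mkdtemp(), "shelf")

# First "session": store a dictionary under a string key.
with shelve.open(path) as db:
    db["settings"] = {"key1": "keyinfo", "key2": "keyinfo2"}

# Later "session": reopen the shelf and read the value back.
with shelve.open(path) as db:
    has_settings = "settings" in db
    settings = db["settings"]

print(has_settings, settings)
```

Values are pickled behind the scenes, so a shelf is best suited to data that only Python programs need to read.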
For completeness, we should include ConfigParser and configparser, which are part of the standard library in Python 2 and 3, respectively. This module reads and writes a config/ini file and (at least in Python 3) behaves in a lot of ways like a dictionary. It has the added benefit that you can store multiple dictionaries in separate sections of your config/ini file and recall them. Sweet!
Python 2.7.x example.
import ConfigParser
config = ConfigParser.ConfigParser()
dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}
# Make each dictionary a separate section in the configuration
config.add_section('dict1')
for key in dict1.keys():
    config.set('dict1', key, dict1[key])
config.add_section('dict2')
for key in dict2.keys():
    config.set('dict2', key, dict2[key])
config.add_section('dict3')
for key in dict3.keys():
    config.set('dict3', key, dict3[key])
# Save the configuration to a file
f = open('config.ini', 'w')
config.write(f)
f.close()
# Read the configuration from a file
config2 = ConfigParser.ConfigParser()
config2.read('config.ini')
dictA = {}
for item in config2.items('dict1'):
    dictA[item[0]] = item[1]
dictB = {}
for item in config2.items('dict2'):
    dictB[item[0]] = item[1]
dictC = {}
for item in config2.items('dict3'):
    dictC[item[0]] = item[1]
print(dictA)
print(dictB)
print(dictC)
Python 3.X example.
import configparser
config = configparser.ConfigParser()
dict1 = {'key1':'keyinfo', 'key2':'keyinfo2'}
dict2 = {'k1':'hot', 'k2':'cross', 'k3':'buns'}
dict3 = {'x':1, 'y':2, 'z':3}
# Make each dictionary a separate section in the configuration
config['dict1'] = dict1
config['dict2'] = dict2
config['dict3'] = dict3
# Save the configuration to a file
with open('config.ini', 'w') as f:
    config.write(f)
# Read the configuration from a file
config2 = configparser.ConfigParser()
config2.read('config.ini')
# ConfigParser objects are a lot like dictionaries, but if you really
# want a dictionary you can ask it to convert a section to a dictionary
dictA = dict(config2['dict1'])
dictB = dict(config2['dict2'])
dictC = dict(config2['dict3'])
print(dictA)
print(dictB)
print(dictC)
Console output:
{'key2': 'keyinfo2', 'key1': 'keyinfo'}
{'k1': 'hot', 'k2': 'cross', 'k3': 'buns'}
{'z': '3', 'y': '2', 'x': '1'}
Contents of config.ini:
[dict1]
key2 = keyinfo2
key1 = keyinfo
[dict2]
k1 = hot
k2 = cross
k3 = buns
[dict3]
z = 3
y = 2
x = 1
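Note that the numbers in dict3 come back as strings, because an ini file stores everything as text. If the original types matter, one option in Python 3 is to convert on the way out with the section's getint method. A rough sketch, using an inline string as a stand-in for the [dict3] section of config.ini:

```python
import configparser

# Stand-in for the [dict3] section of config.ini shown above.
config = configparser.ConfigParser()
config.read_string("[dict3]\nx = 1\ny = 2\nz = 3\n")

section = config["dict3"]

# Plain conversion keeps every value as a string.
as_strings = dict(section)

# getint() parses each value back into an integer.
as_ints = {key: section.getint(key) for key in section}

print(as_strings)
print(as_ints)
```

getfloat and getboolean work the same way for other value types.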
If you are saving to a JSON file, the best and easiest way of doing this is (data here is the dictionary to save):
import json
with open("file.json", "wb") as f:
    f.write(json.dumps(data).encode("utf-8"))
My use case was to save multiple JSON objects to a file, and marty's answer helped me somewhat. But for my use case the answer was not complete, as it would overwrite the old data every time a new entry was saved.
To save multiple entries in a file, one must check for the old content (i.e., read before write). A typical file holding JSON data will have either a list or an object as its root. So I decided that my JSON file would always hold a list of objects, and every time I add data to it, I simply load the list first, append my new data to it, and write it back with the file opened in write-only mode (w):
import json
def saveJson(url, sc):
    # This function writes the two values to the file
    newdata = {'url': url, 'sc': sc}
    json_path = "db/file.json"
    old_list = []
    with open(json_path) as myfile:  # Read the contents first
        old_list = json.load(myfile)
    old_list.append(newdata)
    with open(json_path, "w") as myfile:  # Overwrite the whole content
        json.dump(old_list, myfile, sort_keys=True, indent=4)
    return "success"
The new JSON file will look something like this:
[
    {
        "sc": "a11",
        "url": "www.google.com"
    },
    {
        "sc": "a12",
        "url": "www.google.com"
    },
    {
        "sc": "a13",
        "url": "www.google.com"
    }
]
NOTE: For this approach to work, a file named file.json containing [] as its initial data must already exist.
PS: Not related to the original question, but this approach could be further improved by first checking whether the entry already exists (based on one or more keys), and only then appending and saving the data.
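Both the NOTE and the PS can be handled in code. The following is only a sketch of one way to do it: the function name save_entry, its True/False return value, and the rule that an entry with the same url and sc counts as a duplicate are all assumptions made for this example, not part of the original answer.

```python
import json
import os
import tempfile


def save_entry(json_path, url, sc):
    """Append {'url': url, 'sc': sc} to the list in json_path unless it is already there."""
    new_entry = {"url": url, "sc": sc}

    # Start from an empty list when the file does not exist yet.
    if os.path.exists(json_path):
        with open(json_path) as infile:
            entries = json.load(infile)
    else:
        entries = []

    # Treat an entry with the same url and sc as a duplicate and skip it.
    if new_entry in entries:
        return False

    entries.append(new_entry)
    with open(json_path, "w") as outfile:
        json.dump(entries, outfile, sort_keys=True, indent=4)
    return True


# Usage with a throwaway path.
demo_path = os.path.join(tempfile.mkdtemp(), "file.json")
first = save_entry(demo_path, "www.google.com", "a11")
second = save_entry(demo_path, "www.google.com", "a11")  # duplicate, skipped
third = save_entry(demo_path, "www.google.com", "a12")

with open(demo_path) as infile:
    saved = json.load(infile)
print(saved)
```

Rewriting the whole file on every call stays simple but gets slow for large lists; that trade-off is the same as in the original function.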
Shorter code
Saving and loading most types of Python objects (incl. dictionaries) with one line of code each, once pickle has been imported.
data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}
saving:
pickle.dump(data, open('path/to/file/data.pickle', 'wb'))
loading:
data_loaded = pickle.load(open('path/to/file/data.pickle', 'rb'))
Maybe it's obvious, but I used the two-line solution in the top answer for quite a while before I tried to make it shorter.
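One caveat with the one-liners: the file objects returned by open are never closed explicitly. If that matters, a possible variant that stays at one line each is to go through pathlib, which opens and closes the file internally. The temporary path below is a placeholder for this sketch:

```python
import pickle
import tempfile
from pathlib import Path

data = {'key1': 'keyinfo', 'key2': 'keyinfo2'}
path = Path(tempfile.mkdtemp()) / 'data.pickle'  # hypothetical location

# saving: serialize to bytes, then let pathlib write and close the file
path.write_bytes(pickle.dumps(data))

# loading: read the bytes back and deserialize them
data_loaded = pickle.loads(path.read_bytes())

print(data_loaded)
```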