
With Python, can I keep a persistent dictionary and modify it?

So, I want to store a dictionary in a persistent file. Is there a way to use regular dictionary methods to add, print, or delete entries from the dictionary in that file?

It seems that I would be able to use cPickle to store the dictionary and load it, but I'm not sure where to take it from there.

If your keys (not necessarily the values) are strings, the shelve standard library module does what you want pretty seamlessly.
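As a rough illustration of that suggestion, here is a minimal sketch of shelve used like a dictionary. The file path and the sample entries are made up for the example; the point is only that item assignment, lookup, and `del` go straight to the file.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "example_shelf")

with shelve.open(path) as db:
    db["hello"] = 123                 # add entries with normal dict syntax
    db["foo"] = [1, 2, 3, 4, 5, 6]
    del db["hello"]                   # delete an entry

    # Mutating a stored value in place is not written back by default
    # (unless the shelf is opened with writeback=True), so reassign it.
    items = db["foo"]
    del items[3]
    db["foo"] = items

# Reopen the shelf to confirm the changes were persisted.
with shelve.open(path) as db:
    reloaded = dict(db)

print(reloaded)
```

Note that keys must be strings, while values can be anything pickle can handle.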

Use JSON

Similar to Pete's answer, I like using JSON because it maps very well to Python data structures and is very readable:

Persisting data is trivial:

>>> import json
>>> db = {'hello': 123, 'foo': [1,2,3,4,5,6], 'bar': {'a': 0, 'b':9}}
>>> fh = open("db.json", 'w')
>>> json.dump(db, fh)
>>> fh.close()

and loading it is about the same:

>>> import json
>>> fh = open("db.json", 'r')
>>> db = json.load(fh)
>>> fh.close()
>>> db
{'hello': 123, 'foo': [1, 2, 3, 4, 5, 6], 'bar': {'a': 0, 'b': 9}}
>>> del db['foo'][3]
>>> db['foo']
[1, 2, 3, 5, 6]

In addition, JSON loading doesn't suffer from the same security issues that shelve and pickle do, although IIRC it is slower than pickle.

If you want to write on every operation:

If you want to save on every operation, you can subclass the Python dict object:

import os
import json

class DictPersistJSON(dict):
    """A dict that rewrites its JSON file after every modification."""

    def __init__(self, filename, *args, **kwargs):
        self.filename = filename
        self._load()
        self.update(*args, **kwargs)

    def _load(self):
        # Read existing contents, if any, without triggering a write.
        if (os.path.isfile(self.filename)
                and os.path.getsize(self.filename) > 0):
            with open(self.filename, 'r') as fh:
                dict.update(self, json.load(fh))

    def _dump(self):
        with open(self.filename, 'w') as fh:
            json.dump(self, fh)

    def __getitem__(self, key):
        return dict.__getitem__(self, key)

    def __setitem__(self, key, val):
        dict.__setitem__(self, key, val)
        self._dump()

    def __delitem__(self, key):
        # Deletions need to be persisted as well.
        dict.__delitem__(self, key)
        self._dump()

    def __repr__(self):
        dictrepr = dict.__repr__(self)
        return '%s(%s)' % (type(self).__name__, dictrepr)

    def update(self, *args, **kwargs):
        # Apply all changes first, then write the file once.
        dict.update(self, *args, **kwargs)
        self._dump()

Which you can use like this:

db = DictPersistJSON("db.json")
db["foo"] = "bar" # Will trigger a write

This is woefully inefficient, since the whole file is rewritten on every change, but it can get you off the ground quickly.

Unpickle from file when the program loads, modify it as a normal dictionary in memory while the program is running, and pickle it back to file when the program exits? Not sure exactly what more you're asking for here.
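A minimal sketch of that load-on-start, save-on-exit pattern might look like the following. The helper names `load_state` and `save_state`, the `STATE_FILE` path, and the use of `atexit` to trigger the final save are assumptions made for the illustration, not something the answer specifies.

```python
import atexit
import os
import pickle
import tempfile

# Hypothetical file location for this sketch.
STATE_FILE = os.path.join(tempfile.mkdtemp(), "state.pickle")

def load_state(path):
    """Return the pickled dict stored at path, or an empty dict if none exists."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    return {}

def save_state(path, state):
    """Pickle the whole dict to path, replacing any previous contents."""
    with open(path, "wb") as fh:
        pickle.dump(state, fh)

# Load once at startup and arrange for a save at normal interpreter exit.
state = load_state(STATE_FILE)
atexit.register(save_state, STATE_FILE, state)

# While the program runs, state is just an ordinary in-memory dict.
state["visits"] = state.get("visits", 0) + 1
```

Handlers registered with `atexit` do not run if the process is killed abruptly, so anything that must survive a crash would need an explicit `save_state` call at the relevant points.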

If using only strings as keys (as the shelve module allows) is not enough, FileDict might be a good way to solve this problem.

Assuming the keys and values have working implementations of repr, one solution is to save the string representation of the dictionary (repr(dict)) to a file. You can load it using the eval function (eval(inputstring)). There are two main disadvantages of this technique:

1) It will not work with types that have an unusable implementation of repr (or it may even seem to work, but fail). You'll need to pay at least some attention to what is going on.

2) Your file-load mechanism is basically straight-out executing Python code. Not great for security unless you fully control the input.

It has one advantage: it is absurdly easy to do.
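The round trip described above can be sketched as follows. One deliberate change from the answer: the loading step swaps in `ast.literal_eval` for `eval`, which only accepts Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, None) and so avoids executing arbitrary code, at the cost of not supporting custom types. The file path and sample data are invented for the example.

```python
import ast
import os
import tempfile

# Hypothetical file and data for this sketch.
path = os.path.join(tempfile.mkdtemp(), "db.txt")
db = {"hello": 123, "foo": [1, 2, 3], ("a", 1): None}

# Save: write the dictionary's repr() as plain text.
with open(path, "w") as fh:
    fh.write(repr(db))

# Load: parse the text back into a dictionary.
# ast.literal_eval is used here instead of eval (see the note above).
with open(path) as fh:
    loaded = ast.literal_eval(fh.read())

print(loaded == db)
```

Unlike JSON, this keeps non-string keys such as the tuple above intact.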

My favorite method (which does not use standard Python dictionary functions): read/write YAML files using PyYAML. See this answer for details, summarized here:

Create a YAML file, "employment.yml":

new jersey:
  mercer county:
    plumbers: 3
    programmers: 81
  middlesex county:
    salesmen: 62
    programmers: 81
new york:
  queens county:
    plumbers: 9
    salesmen: 36

Step 3: Read it in Python

import yaml
file_handle = open("employment.yml")
my__dictionary = yaml.safe_load(file_handle)
file_handle.close()

and now my__dictionary has all the values. If you needed to do this on the fly, create a string containing YAML and parse it with yaml.safe_load.

Pickling has one disadvantage: it can be expensive if your dictionary is large and has to be read from and written to disk frequently. pickle dumps the whole thing at once, and unpickle loads the whole thing at once.

If you have to handle small dicts, pickle is OK. If you are going to work with something more complex, go for Berkeley DB. It is basically made to store key:value pairs.

Have you considered using dbm?

import dbm
from io import StringIO
import pandas as pd
import numpy as np

db = dbm.open('mydbm.db', 'n')  # 'n' always creates a new, empty database

# create some data
df1 = pd.DataFrame(np.random.randint(0, 100, size=(15, 4)), columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randint(101, 200, size=(10, 3)), columns=list('EFG'))

# serialize the data and put it in the db dictionary
db['df1'] = df1.to_json()
db['df2'] = df2.to_json()


# in some other process:
db = dbm.open('mydbm.db', 'r')
# values come back as bytes, so decode them before parsing
df1a = pd.read_json(StringIO(db['df1'].decode()))
df2a = pd.read_json(StringIO(db['df2'].decode()))

This tends to work even without a db.close(), although closing the database explicitly is the safer habit.
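For plain dictionaries without pandas, a standard-library-only sketch of the same idea could look like this. The path and the sample values are hypothetical, and encoding the values with json is just one possible choice of serialization.

```python
import dbm
import json
import os
import tempfile

# Hypothetical database path for this sketch.
path = os.path.join(tempfile.mkdtemp(), "example_dbm")

# 'c' opens the database for reading and writing, creating it if needed.
# Using it as a context manager closes it when the block ends.
with dbm.open(path, "c") as db:
    db["settings"] = json.dumps({"retries": 3, "hosts": ["a", "b"]})
    db["counter"] = "41"

# Reopen read-only, as another process might.
with dbm.open(path, "r") as db:
    settings = json.loads(db["settings"])   # values come back as bytes
    counter = int(db["counter"]) + 1
    keys = sorted(db.keys())                # keys come back as bytes too

print(settings, counter, keys)
```

dbm stores only strings or bytes, so anything structured has to be serialized first, which is what the `to_json()` calls in the pandas version are doing.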
