Most pythonic and fastest way to create a list of key value pairs from a set of nested dictionaries?

I came up with the following solution, but it was quite ugly (see the original solution below). I'm fairly happy with the revised solution. Does anybody have a cleaner or faster way to produce the same output?

Other requirements:

  • Must accept any value and return a list of key value pairs.
  • The final key must track the list of keys needed to access the value, in dot syntax.
  • Must return a list of key value pairs or a dictionary.
  • Must remove the leading . when no base_key is supplied.

My revised solution:

def create_nested_kvl(v, base_key=None):
    kvl = []
    if not isinstance(v, dict):
        kvl.append((base_key,v))
    else:
        def iterate(v, k):
            for ki, vi in v.items():
                ki = '%s.%s' % (k, ki) if k else ki
                iterate(vi, ki) if isinstance(vi, dict) else kvl.append((ki, vi))
        iterate(v, base_key)
    return kvl

My original solution:

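def create_nested_kvl(v, base_key=''):
    """ Creates a list of dot syntax key value pairs from a nested dictionary.
    :param      v: The value suspected to be a nested dictionary.
    :param      base_key: Base key prepended to every generated key.
    :return:    [(k,v)]
    :rtype:     list
    """
    if not isinstance(v, dict):
        return [(base_key, v)]

    kvl = []
    def iterate(d, k):
        # Append each item of d to kvl under its dotted key; nested dicts
        # are appended as-is and expanded by the loop below.
        for kd, vd in d.items():
            kd = '%s.%s' % (k, kd) if k else kd
            kvl.append((kd, vd))

    iterate(v, base_key)
    # Walk kvl by index rather than with a for loop: deleting an entry
    # while a for loop iterates over the list would skip the entry after it.
    i = 0
    while i < len(kvl):
        k, vi = kvl[i]
        if isinstance(vi, dict):
            iterate(vi, k)  # children are appended to the end of kvl
            del kvl[i]      # the next entry shifts into slot i
        else:
            i += 1
    return kvl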

Input:

v = {'type1':'type1_val',
     'type2':'type2_val',
     'object': {
          'k1': 'val1',
          'k2': 'val2',
          'k3': {'k31': {
                     'k311': 'val311',
                     'k322': 'val322',
                     'k333': 'val333'
                     },
                'k32': 'val32',
                'k33': 'val33'}}}

create_nested_kvl(v, 'base')

Output:

[('base.type1', 'type1_val'),
 ('base.type2', 'type2_val'),
 ('base.object.k2', 'val2'),
 ('base.object.k1', 'val1'),
 ('base.object.k3.k33', 'val33'),
 ('base.object.k3.k32', 'val32'),
 ('base.object.k3.k31.k311', 'val311'),
 ('base.object.k3.k31.k333', 'val333'),
 ('base.object.k3.k31.k322', 'val322')]

Notes:

  • The generator solution presented by Alex Martelli is very slick. Unfortunately, it appears to be a tad slower than my original and revised solutions. Also, it returns a generator, which still needs to be converted to a list or, poof, it's gone.

timeit results @ number=1000000:

generator : 0.911420848311 (see alex's answer)
original  : 0.720069713321
revised   : 0.660259814902

best      : 0.660259814902 
* As Alex pointed out, my late-night rounding skills are horrific.
It's 27% faster, not twice as fast (my bad).
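
The post does not show the harness behind these numbers, so the following is only a sketch of how such a comparison might be set up with timeit. It assumes Python 3; the names SAMPLE, revised, generator and time_candidates are made up for illustration, and revised spells the conditional as an if/else statement rather than a conditional expression. Absolute timings will differ by machine and interpreter version.

```python
import timeit

# Sample input from the question.
SAMPLE = {'type1': 'type1_val',
          'type2': 'type2_val',
          'object': {
              'k1': 'val1',
              'k2': 'val2',
              'k3': {'k31': {'k311': 'val311',
                             'k322': 'val322',
                             'k333': 'val333'},
                     'k32': 'val32',
                     'k33': 'val33'}}}


def revised(v, base_key=None):
    """Recursive closure version, equivalent to the revised solution."""
    kvl = []
    if not isinstance(v, dict):
        kvl.append((base_key, v))
    else:
        def iterate(v, k):
            for ki, vi in v.items():
                ki = '%s.%s' % (k, ki) if k else ki
                if isinstance(vi, dict):
                    iterate(vi, ki)
                else:
                    kvl.append((ki, vi))
        iterate(v, base_key)
    return kvl


def generator(v, k=''):
    """Generator version, as in Alex's answer."""
    if isinstance(v, dict):
        for tk in v:
            for sk, sv in generator(v[tk], tk):
                yield '{}.{}'.format(k, sk), sv
    else:
        yield k, v


def time_candidates(number=10000):
    """Return {name: total seconds} for `number` calls of each candidate."""
    candidates = {
        # list() forces the generator to run to completion.
        'generator': lambda: list(generator(SAMPLE, 'base')),
        'revised': lambda: revised(SAMPLE, 'base'),
    }
    return {name: timeit.timeit(fn, number=number)
            for name, fn in candidates.items()}


if __name__ == '__main__':
    for name, seconds in sorted(time_candidates().items()):
        print('%-10s: %f' % (name, seconds))
```

Wrapping the generator in list() matters here: without it, timeit would only measure creating the generator object, not producing the pairs.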

Apart from the ordering of keys in dicts being arbitrary, and the possible need to trim leading .s if that's needed for empty keys (the spec is unclear):

def create_nested_kvl(v, k=''):
    if isinstance(v, dict):
        for tk in v:
            for sk, sv in create_nested_kvl(v[tk], tk):
                yield '{}.{}'.format(k, sk), sv
    else:
        yield k, v

seems nice and compact. E.g.:

v = {'type1':'type1_val',
     'type2':'type2_val',
     'object': {
          'k1': 'val1',
          'k2': 'val2',
          'k3': {'k31': {
                     'k311': 'val311',
                     'k322': 'val322',
                     'k333': 'val333'
                     },
                'k32': 'val32',
                'k33': 'val33'}}}

import pprint
pprint.pprint(list(create_nested_kvl(v, 'base')))

emits

[('base.object.k3.k31.k311', 'val311'),
 ('base.object.k3.k31.k333', 'val333'),
 ('base.object.k3.k31.k322', 'val322'),
 ('base.object.k3.k33', 'val33'),
 ('base.object.k3.k32', 'val32'),
 ('base.object.k2', 'val2'),
 ('base.object.k1', 'val1'),
 ('base.type1', 'type1_val'),
 ('base.type2', 'type2_val')]

as required.
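
If the leading . does need trimming when no base key is given, one possible variation is sketched below. It assumes Python 3.7+ (for yield from and insertion-ordered dicts) and non-empty string keys; the name iter_nested_kv is made up for illustration and is not from the question.

```python
def iter_nested_kv(v, k=''):
    """Yield (dotted_key, value) pairs; no leading '.' when k is empty."""
    if isinstance(v, dict):
        for tk, tv in v.items():
            # Only join with '.' when there is a prefix to join onto.
            child_key = '{}.{}'.format(k, tk) if k else tk
            yield from iter_nested_kv(tv, child_key)
    else:
        yield k, v


if __name__ == '__main__':
    nested = {'a': {'b': 1}, 'c': 2}
    print(list(iter_nested_kv(nested)))          # list of pairs
    print(dict(iter_nested_kv(nested, 'base')))  # or a dictionary
```

Because the result is a generator of pairs, passing it to list() or dict() covers the "list of key value pairs or a dictionary" requirement without a second function.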

Added: in Python, "fast" and "elegant" often coincide, but not always. In particular, recursion is slightly slower, and so are lookups of globals in a loop. So here, pulling all the usual tricks (recursion elimination with an explicit stack, and lookup hoisting), one can get:

def faster(v, k='', isinstance=isinstance):
    stack = [(k, v)]
    result = []
    push, pop = stack.append, stack.pop
    resadd = result.append
    fmt = '{}.{}'.format
    while stack:
        k, v = pop()
        if isinstance(v, dict):
            for tk, vtk in v.iteritems():
                push((fmt(k, tk), vtk))
        else:
            resadd((k, v))
    return result

...definitely not as elegant, but on my laptop, my original version, plus a list() at the end, takes 21.5 microseconds on the given sample v; this faster version takes 16.8 microseconds. If saving those 4.7 microseconds (or, expressed more meaningfully, 22% of the original runtime) is more important than clarity and maintainability, then one can pick the second version and get the same results (net of ordering, as usual) that much faster.

The OP's "revised version" is still faster on the sample v, partly because formatting with % is slightly faster in Python 2 than the more elegant format, and partly because items is slightly faster (again, Python 2 only) than iteritems; some hoisting might shave a few more nanoseconds off that one, too.
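
For illustration only, here is a sketch that combines the explicit stack with % formatting and items(), and also skips the leading . when no base key is given. It assumes Python 3, where iteritems no longer exists; the name flatten_stack is made up, and the sketch has not been timed, so whether it beats either version above would need to be measured.

```python
def flatten_stack(v, k=''):
    """Flatten nested dicts into (dotted_key, value) pairs without recursion."""
    stack = [(k, v)]
    result = []
    # Hoist bound-method lookups out of the loop.
    push, pop = stack.append, stack.pop
    resadd = result.append
    while stack:
        k, v = pop()
        if isinstance(v, dict):
            for tk, vtk in v.items():
                # No leading '.' when the prefix is empty.
                push(('%s.%s' % (k, tk) if k else tk, vtk))
        else:
            resadd((k, v))
    return result


if __name__ == '__main__':
    print(sorted(flatten_stack({'a': {'b': 1}, 'c': 2}, 'base')))
```

Since the stack is last-in, first-out, the pairs come back in a different order than the recursive versions produce; sort the result or build a dict from it if order matters.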
