Updating list of dictionaries with another list of dictionaries. Is there a faster way?

I have a list of dictionaries that I need to update with information from another list of dictionaries. My current solution (below) takes every dictionary from the first list and compares it to every dictionary in the second list. It works, but is there a faster, more elegant way of achieving the same result?

a = [ { "id": 1, "score":200 }, { "id": 2, "score":300 }, { "id":3, "score":400 } ]
b = [ { "id": 1, "newscore":500 }, { "id": 2, "newscore":600 } ]
# update a with data from b
for item in a:
    for replacement in b:
        if item["id"]==replacement["id"]:
            item.update({"score": replacement["newscore"]})

Create a dictionary indexed by id from the first list. Then loop through the second list and look up each id in that dictionary.

# index the dictionaries in a by id
lookup = {item['id']: item for item in a}
for replacement in b:
    v = lookup.get(replacement['id'])
    if v is not None:
        v['score'] = replacement['newscore']

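This converts an O(n^2) problem to an O(n) problem: building the dictionary takes one pass over a, the update takes one pass over b, and each dictionary lookup is constant time on average.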
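To make the complexity claim concrete, here is a minimal sketch that counts the work each approach does. The function names `update_nested` and `update_indexed`, the counters, and the synthetic list sizes are illustrative assumptions, not part of the original answer.

```python
import copy

def update_nested(a, b):
    """Original approach: compare every item in a with every item in b."""
    comparisons = 0
    for item in a:
        for replacement in b:
            comparisons += 1
            if item["id"] == replacement["id"]:
                item["score"] = replacement["newscore"]
    return comparisons

def update_indexed(a, b):
    """Lookup approach: index a by id once, then walk b once."""
    lookup = {item["id"]: item for item in a}  # one pass over a
    steps = len(a)
    for replacement in b:                       # one pass over b
        steps += 1
        target = lookup.get(replacement["id"])
        if target is not None:
            target["score"] = replacement["newscore"]
    return steps

# Synthetic data (sizes chosen only for illustration)
a = [{"id": i, "score": 0} for i in range(1000)]
b = [{"id": i, "newscore": i * 10} for i in range(0, 1000, 2)]

a_nested, a_indexed = copy.deepcopy(a), copy.deepcopy(a)
nested_count = update_nested(a_nested, b)
indexed_count = update_indexed(a_indexed, b)

print(nested_count, indexed_count)
```

The nested version does len(a) * len(b) comparisons, while the indexed version does roughly len(a) + len(b) steps, and both leave the list in the same state.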

Instead of doing a len(a) * len(b) loop, process b into something easier to work with:

In [48]: replace = {d["id"]: {"score": d["newscore"]} for d in b}

In [49]: new_a = [{**d, **replace.get(d['id'], {})} for d in a]

In [50]: new_a
Out[50]: [{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Note that the {**somedict} syntax requires a modern version of Python (>= 3.5).
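If the unpacking syntax is not available, one possible equivalent is to copy each dictionary with dict() and then call update() on the copy. This is only a sketch; `merge_scores` is a hypothetical helper name, not something from the original answer.

```python
def merge_scores(a, b):
    """Return a new list; the dictionaries in a are left untouched."""
    replace = {d["id"]: {"score": d["newscore"]} for d in b}
    new_a = []
    for d in a:
        merged = dict(d)                         # shallow copy of the original
        merged.update(replace.get(d["id"], {}))  # no-op when the id is absent from b
        new_a.append(merged)
    return new_a

a = [{"id": 1, "score": 200}, {"id": 2, "score": 300}, {"id": 3, "score": 400}]
b = [{"id": 1, "newscore": 500}, {"id": 2, "newscore": 600}]

new_a = merge_scores(a, b)
print(new_a)
print(a[0])  # the original entry still holds the old score
```

Like the comprehension above, this builds new dictionaries rather than modifying a in place, which matters if other code holds references to the originals.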

List comprehension:

[i.update({"score": x["newscore"]}) for x in b for i in a if i['id']==x['id']]
print(a)

Output:

[{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Timing:

%timeit [i.update({"score": x["newscore"]}) for x in b for i in a if i['id']==x['id']]

Output:

100000 loops, best of 3: 3.9 µs per loop
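The figure above was measured on the three-item example, and the comprehension still visits every pair of dictionaries. A rough way to compare approaches on larger input outside IPython is the standard-library timeit module. The sketch below assumes synthetic list sizes, and the names `pairwise_scan` and `dict_lookup` are illustrative. Actual timings depend on the machine, so none are shown.

```python
import copy
import timeit

def pairwise_scan(a, b):
    # Same pairwise comparison as the comprehension, written as a plain loop
    for x in b:
        for i in a:
            if i["id"] == x["id"]:
                i["score"] = x["newscore"]

def dict_lookup(a, b):
    # Index the new scores by id, then make one pass over a
    new_scores = {x["id"]: x["newscore"] for x in b}
    for i in a:
        if i["id"] in new_scores:
            i["score"] = new_scores[i["id"]]

# Synthetic data (sizes chosen only for illustration)
base_a = [{"id": n, "score": 0} for n in range(500)]
b = [{"id": n, "newscore": n + 1} for n in range(250)]

a1, a2 = copy.deepcopy(base_a), copy.deepcopy(base_a)
t_pairwise = timeit.timeit(lambda: pairwise_scan(a1, b), number=10)
t_lookup = timeit.timeit(lambda: dict_lookup(a2, b), number=10)

print(f"pairwise: {t_pairwise:.4f}s  lookup: {t_lookup:.4f}s")
```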

If you are open to using pandas, and a and b are pandas DataFrames, then here is a one-liner:

a.loc[a.id.isin(b.id), 'score'] = b.loc[b.id.isin(a.id), 'newscore']

Converting a and b to DataFrames is simple: just use pd.DataFrame.from_records.
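One caveat: assigning a Series through .loc aligns on the row index, so the one-liner relies on matching rows sharing the same index labels in both frames. A sketch that matches on the id values instead, using Series.map and fillna, might look like the following. The variable names `df_a`, `df_b` and `new_scores` are illustrative, and the approach assumes each id appears at most once in b.

```python
import pandas as pd

a = [{"id": 1, "score": 200}, {"id": 2, "score": 300}, {"id": 3, "score": 400}]
# Deliberately in a different row order than a
b = [{"id": 2, "newscore": 600}, {"id": 1, "newscore": 500}]

df_a = pd.DataFrame.from_records(a)
df_b = pd.DataFrame.from_records(b)

# Map each id in df_a to its new score; ids missing from df_b become NaN
new_scores = df_a["id"].map(df_b.set_index("id")["newscore"])

# Keep the old score where no replacement exists, then restore the integer dtype
df_a["score"] = new_scores.fillna(df_a["score"]).astype(int)

print(df_a)
```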


Another way of doing this, if you can change "newscore" to "score":

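import pandas as pd

# assumes the dictionaries in b use the key "score" rather than "newscore"
a = pd.DataFrame.from_records(a, index="id")
b = pd.DataFrame.from_records(b, index="id")
a.update(b)  # in place; rows are matched on the id index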

Here are the timeit results:

In [10]: %timeit c = a.copy(); c.update(b)
702 µs ± 37.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

First create a dict of scores to update:

>>> new_d={d['id']:d for d in b}
>>> new_d
{1: {'id': 1, 'newscore': 500}, 2: {'id': 2, 'newscore': 600}}

Then iterate over a and update by id:

for d in a:
    if d['id'] in new_d:
        d['score']=new_d[d['id']]['newscore']

>>> a
[{'id': 1, 'score': 500}, {'id': 2, 'score': 600}, {'id': 3, 'score': 400}]

Which could be even simpler as:

new_d={d['id']:d['newscore'] for d in b}
for d in a:
    if d['id'] in new_d:
        d['score']=new_d[d['id']]
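If this pattern comes up with different key names, the loop above could be wrapped in a small helper. The function `apply_updates` and its parameters are hypothetical names introduced for illustration. The sketch assumes every dictionary contains the matching key, and that when b repeats an id the last entry wins.

```python
def apply_updates(records, updates, key="id", source="newscore", target="score"):
    """Update records in place from updates, matching on key.

    Returns the number of records that were changed.
    """
    new_values = {u[key]: u[source] for u in updates}  # later duplicates win
    updated = 0
    for record in records:
        if record[key] in new_values:
            record[target] = new_values[record[key]]
            updated += 1
    return updated

a = [{"id": 1, "score": 200}, {"id": 2, "score": 300}, {"id": 3, "score": 400}]
b = [{"id": 1, "newscore": 500}, {"id": 2, "newscore": 600}]

count = apply_updates(a, b)
print(count, a)
```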
