Python. How to subtract 2 dictionaries

I have 2 dictionaries, A and B. A has 700000 key-value pairs and B has 560000 key-value pairs. All key-value pairs from B are present in A, but some keys in A are duplicates with different values and some have duplicated values but unique keys. I would like to subtract B from A, so I can get the remaining 140000 key-value pairs. When I subtract key-value pairs based on key identity, I remove, let's say, 150000 key-value pairs because of the repeated keys. I want to subtract key-value pairs based on the identity of BOTH key AND value for each key-value pair, so I get 140000. Any suggestion would be welcome.

This is an example:

A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}

I DO want to get: AB = {'10':1, '12':1, '10':2, '11':3}

I DO NOT want to get:

a) When based on keys:

{'10':1, '12':1, '10':2}

or

b) When based on values:

{'11':3}

To get items in A that are not in B, based just on key:

C = {k:v for k,v in A.items() if k not in B}

To get items in A that are not in B, based on key and value:

C = {k:v for k,v in A.items() if k not in B or v != B[k]}
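As a quick check of the two comprehensions above, here is a minimal sketch with made-up sample data (the dictionaries below are assumptions for illustration, not the asker's data). Each key appears only once, since a dict cannot hold duplicate keys.

```python
# Hypothetical sample data; each key appears only once.
A = {'10': 1, '11': 2, '12': 3}
B = {'11': 2, '12': 99}

# Key-only subtraction: drops every key of A that also occurs in B.
by_key = {k: v for k, v in A.items() if k not in B}

# Key-and-value subtraction: drops an entry only when B holds the same pair.
by_item = {k: v for k, v in A.items() if k not in B or v != B[k]}

print(by_key)   # {'10': 1}
print(by_item)  # {'10': 1, '12': 3}
```

The pair '12': 3 survives the second version because B maps '12' to a different value.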

To update A in place (as in A -= B), do:

from collections import deque
consume = deque(maxlen=0).extend
consume(A.pop(key, None) for key in B)

(Unlike using map() with A.pop, calling A.pop with a None default will not break if a key from B is not present in A. Also, unlike using all(), this iterator consumer will iterate over all values, regardless of the truthiness of the popped values.)
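To illustrate the note above, the following sketch (with hypothetical sample values) compares all() with the zero-length deque consumer when one of the popped values is falsy.

```python
from collections import deque

B = {'a': 0, 'b': 1, 'missing': 2}

# all() stops at the first falsy popped value, so later keys are left in place.
A1 = {'a': 0, 'b': 1, 'c': 2}
all(A1.pop(key, None) for key in B)
print(A1)  # {'b': 1, 'c': 2}

# A zero-length deque consumes the whole generator and discards the results.
A2 = {'a': 0, 'b': 1, 'c': 2}
deque((A2.pop(key, None) for key in B), maxlen=0)
print(A2)  # {'c': 2}
```

In the first case 'b' is never popped, because the value 0 popped for 'a' ends the all() loop early.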

A simple, intuitive approach is:

dict(set(a.items()) - set(b.items()))

A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}

You can't have duplicate keys in a Python dict. If you run the above, it will get reduced to:

A={'11': 3, '10': 2, '12': 1}
B={'11': 2}

But to answer your question, to do A - B (based on dict keys):

all(map(A.pop, B))  # all() forces the lazy map in Python 3, so this works in Python 2 and 3
print(A)  # {'10': 2, '12': 1}
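Because a dict keeps only the last value for a repeated key, the subtraction asked for in the question needs a container that can hold repeated keys. One possible sketch, assuming the data can be loaded as lists of (key, value) tuples instead of dicts and that the values are hashable (the names pairs_a, pairs_b, to_remove and remaining are hypothetical):

```python
# The question's data as (key, value) pairs, so repeated keys survive.
pairs_a = [('10', 1), ('11', 1), ('12', 1), ('10', 2), ('11', 2), ('11', 3)]
pairs_b = [('11', 1), ('11', 2)]

# A set keeps the membership test cheap for large inputs.
to_remove = set(pairs_b)
remaining = [pair for pair in pairs_a if pair not in to_remove]

print(remaining)  # [('10', 1), ('12', 1), ('10', 2), ('11', 3)]
```

This matches the result the question asks for, with the pairs kept in their original order.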

Another way of using the efficiency of sets. This might be more multipurpose than the answer by @brien. His answer is very nice and concise, so I upvoted it.

diffKeys = set(a.keys()) - set(b.keys())
c = dict()
for key in diffKeys:
  c[key] = a.get(key)

EDIT: There is an assumption here, based on the OP's question, that dict B is a subset of dict A, i.e. that the key/value pairs in B are all in A. The above code will have unexpected results if you are not working strictly with a key/value subset. Thanks to Steven for pointing this out in his comment.
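The caveat in the edit can be seen with a small hypothetical example in which b shares a key with a but holds a different value for it:

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'y': 2, 'z': 99}  # 'z' is in both, but with a different value

# Key-based difference: any key present in b is dropped, whatever its value.
diff_keys = set(a.keys()) - set(b.keys())
c = {key: a[key] for key in diff_keys}

print(c)  # {'x': 1}
```

The pair 'z': 3 is dropped even though b never contained that exact pair.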

Since I cannot (yet) comment: the accepted answer will fail if there are some keys in B not present in A.

Using dict.pop with a default would circumvent it (borrowed from "How to remove a key from a Python dictionary?"):

all(A.pop(k, None) for k in B)

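or, because all() stops at the first falsy result (including the None returned for a missing key), more reliably: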

tuple(A.pop(k, None) for k in B)

dict-views:

Keys views are set-like since their entries are unique and hashable. If all values are hashable, so that (key, value) pairs are unique and hashable, then the items view is also set-like. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).

So you can:

>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> B = {'11':1, '11':2}
>>> A.items() - B.items()
{('11', 3), ('12', 1), ('10', 2)}
>>> dict(A.items() - B.items())
{'11': 3, '12': 1, '10': 2}

For Python 2, use dict.viewitems.
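One consequence of the quoted passage is that the items view supports set operations only when every value is hashable. A minimal sketch with hypothetical data, where the values are lists:

```python
A = {'10': [1, 2], '11': [3]}
B = {'11': [3]}

raised = False
try:
    A.items() - B.items()
except TypeError as exc:
    # Lists are unhashable, so the (key, value) pairs cannot go into a set.
    raised = True
    print('set difference failed:', exc)

# A comprehension compares values with == and does not need hashing.
C = {k: v for k, v in A.items() if k not in B or v != B[k]}
print(C)  # {'10': [1, 2]}
```

For unhashable values, the comprehension is the fallback.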

P.S. You can't have duplicate keys in a dict.

>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> A
{'10': 2, '11': 3, '12': 1}
>>> B = {'11':1, '11':2}
>>> B
{'11': 2}

result = A.copy()
for key in B:
    if key in A and A[key] == B[key]:  # drop only exact key/value matches
        del result[key]

Based only on keys, assuming A is a superset of B (equivalently, B is a subset of A):

Python 3: c = {k:a[k] for k in a.keys() - b.keys()}

Python 2: c = {k:a[k] for k in list(set(a.keys())-set(b.keys()))}

For an approach that is based on keys and can also be used to update a in place, see @PaulMcG's answer.

For subtracting the dictionaries, you could do:

A.subtract(B)

Note: subtract is a method of collections.Counter, not of a plain dict, so A must be a Counter. It subtracts the counts in place, and it will give you negative values in a situation where B has keys that A does not.
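A minimal sketch of that call using collections.Counter, the standard-library class that provides subtract. It assumes the values are counts; the sample data is hypothetical. Note that this subtracts counts per key and does not remove key-value pairs.

```python
from collections import Counter

A = Counter({'10': 2, '11': 3, '12': 1})
B = Counter({'11': 2, '13': 1})

A.subtract(B)  # modifies A in place and returns None
print(dict(A))  # {'10': 2, '11': 1, '12': 1, '13': -1}

# Counts can reach zero or go negative; unary plus keeps only positive counts.
positive = +A
print(dict(positive))  # {'10': 2, '11': 1, '12': 1}
```

The key '13' exists only in B, which is why it ends up with a negative count in A.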
