Filter the value of dict with tuple key in Python?

Let's say I have a very large dictionary where each key is a tuple. Here is an example:

dic = {('A', 'B'): 1,
       ('A', 'C'): 2,
       ('B', 'C'): 3}

I am trying to filter the values of the dict by only one element of the key. Here is what I am doing:

my_list = []

for k, v in dic.items():
    if "A" in k:
        my_list.append(v)

print(my_list)
[1, 2]

Is there a better way to do this without looping through the whole dictionary?

A list comprehension is more Pythonic, in my opinion:

dic = {('A', 'B'): 1,
       ('A', 'C'): 2,
       ('B', 'C'): 3}
# Bind the result to a new name so that dic is not overwritten by the list
my_list = [el[1] for el in dic.items() if 'A' in el[0]]
print(my_list)
# Prints [1, 2]

Or with tuple unpacking:

my_list = [v for k, v in dic.items() if 'A' in k]

Since you're searching repeatedly, this can be improved. Basically, you want to avoid traversing the entire data structure on every search. As long as memory permits, you can precompute a map from each element of the keys either to the set of keys that contain it or directly to the matching values.

from collections import defaultdict
index = defaultdict(set)
for k in dic:
    for element in k:
        index[element].add(k)

Then something like:

my_list = [dic[k] for k in index["A"]]

This might still underperform if you're only doing a couple of searches, since building the index is more expensive than a single search. You'd want to profile to determine that.
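As a rough way to check that trade-off, here is a minimal profiling sketch using the standard timeit module. The test data (10,000 keys built from 100 made-up labels), the helper names scan, build_index and lookup, and the repeat count are all assumptions chosen for illustration; real numbers depend on your data and machine.

```python
import timeit
from collections import defaultdict

# Hypothetical test data: 100 labels give 100 * 100 = 10,000 tuple keys.
labels = [f"k{i}" for i in range(100)]
dic = {}
for a in labels:
    for b in labels:
        dic[(a, b)] = len(dic)

def scan(target):
    # Full traversal on every search.
    return [v for k, v in dic.items() if target in k]

def build_index():
    # One-time cost: map each key element to the set of keys containing it.
    index = defaultdict(set)
    for k in dic:
        for element in k:
            index[element].add(k)
    return index

index = build_index()

def lookup(target):
    # Search through the prebuilt index (result order is not guaranteed).
    return [dic[k] for k in index[target]]

runs = 20
print("scan:  ", timeit.timeit(lambda: scan("k7"), number=runs))
print("build: ", timeit.timeit(build_index, number=runs))
print("lookup:", timeit.timeit(lambda: lookup("k7"), number=runs))
```

If the build time is larger than the total time of the scans you expect to run, the plain comprehension is probably good enough.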

There are possible variations on this. For instance, you can maintain the index as a parallel structure to the dict itself, updating it as you add and delete elements. That's more complex, but it spreads the cost of computing the index across updates, and you don't have to rebuild it unless something goes wrong. But this is the basic idea.
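One way the parallel-structure variation could look is sketched below. The class name IndexedDict and the method values_with are hypothetical, not part of any library, and only insertion, deletion and lookup are covered.

```python
from collections import defaultdict

class IndexedDict:
    """Hypothetical wrapper: tuple-keyed data plus an element -> keys index."""

    def __init__(self):
        self._data = {}
        self._index = defaultdict(set)

    def __setitem__(self, key, value):
        self._data[key] = value
        for element in key:
            self._index[element].add(key)

    def __delitem__(self, key):
        del self._data[key]  # raises KeyError if the key is absent
        for element in set(key):  # set() handles keys like ('A', 'A')
            keys = self._index[element]
            keys.discard(key)
            if not keys:
                del self._index[element]  # drop empty buckets

    def values_with(self, element):
        # .get avoids creating an empty bucket for unknown elements
        return [self._data[k] for k in self._index.get(element, ())]

d = IndexedDict()
d[("A", "B")] = 1
d[("A", "C")] = 2
d[("B", "C")] = 3
print(sorted(d.values_with("A")))  # [1, 2]
del d[("A", "B")]
print(d.values_with("A"))  # [2]
```

Because the index stores sets of keys, the order of the returned values is not guaranteed, which is why the first print sorts them.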

I would restructure your dictionary.

dic = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 3}

This is pretty much the same as:

better_dictionary = {"A": [1,2], "B": [1,3], "C": [2,3]}

Here is some code:

dic = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 3}
better_dictionary = {}
my_list = []

# (one time) init better dict: O(n) -- 2n insertions, with n being the number of tuples
for k, v in dic.items():
    for i in k:
        if i in better_dictionary:
            better_dictionary[i].append(v)
        else:
            better_dictionary[i] = [v]

my_list = better_dictionary["A"] # get values: O(1)
print(my_list)
# [1, 2]

The list comprehension suggested by @Yevgeniy Kosmak is the most Pythonic way, in my opinion. However, if you don't want an explicit loop, you could use filter and zip (which is pretty ugly, in my opinion):

list(zip(*filter(lambda x: 'A' in x[0], dic.items())))[1]

Here, filter creates an iterable of 2-tuples where the first element is a key containing 'A' and the second element is the corresponding value in dic.

Then we unpack this iterable and use zip to transpose it into two tuples, one of keys and one of values. We want the second one, hence the [1].

Output:

(1, 2)
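To make the one-liner easier to follow, here is the same computation split into steps. The variable names pairs, transposed, values and no_match are mine, and the empty-result guard is an addition that the one-liner does not have: when no key matches, zip(*...) yields nothing and indexing with [1] raises IndexError.

```python
dic = {('A', 'B'): 1, ('A', 'C'): 2, ('B', 'C'): 3}

# Step 1: filter keeps the (key, value) pairs whose key contains 'A'.
pairs = list(filter(lambda x: 'A' in x[0], dic.items()))
print(pairs)  # [(('A', 'B'), 1), (('A', 'C'), 2)]

# Step 2: zip(*pairs) transposes the pairs into a tuple of keys and a tuple of values.
transposed = list(zip(*pairs))
print(transposed)  # [(('A', 'B'), ('A', 'C')), (1, 2)]

# Step 3: index 1 holds the values; fall back to () when nothing matched.
values = transposed[1] if transposed else ()
print(values)  # (1, 2)

# With no matching key, the transposed list is empty.
no_match = list(zip(*filter(lambda x: 'Z' in x[0], dic.items())))
print(no_match[1] if no_match else ())  # ()
```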

If you need to do this multiple times, it may be a better idea to create another dictionary where the keys are the first elements of the keys in dic and the values are lists of the values in dic whose keys share that first element. That's suggested by @luku. Another way to do the same job is to use the dict.setdefault method while traversing dic, appending to the lists as you go:

out = {}
for (k,_),v in dic.items():
    out.setdefault(k,[]).append(v)

Then you can do:

>>> print(out['A'])
[1, 2]
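Note that the loop above groups by the first element of each key only, whereas the original 'A' in k test matches an element in any position. The two agree for 'A' in this example but not for 'B'. A small sketch of the difference, with first_only and any_position as illustrative names:

```python
dic = {('A', 'B'): 1, ('A', 'C'): 2, ('B', 'C'): 3}

# First-element index, as in the loop above (requires 2-tuple keys).
first_only = {}
for (k, _), v in dic.items():
    first_only.setdefault(k, []).append(v)

# Variant: index every element of the key, whatever its position.
any_position = {}
for key, v in dic.items():
    for element in set(key):  # set() avoids double-counting keys like ('A', 'A')
        any_position.setdefault(element, []).append(v)

print(first_only['B'])    # [3]
print(any_position['B'])  # [1, 3]
```

Which one you want depends on whether the position of the element within the key matters for your lookups.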
