
Identify same values for particular key in list of dictionaries

I have a list of dictionaries that looks like this:

[
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 16, 'primary': '16', 'secondary': '8'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'},
    {'ServiceID': 12, 'primary': '12', 'secondary': '20'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'}
]

I would like to create a new sorted list of dictionaries, each with the following keys:

key = value of 'ServiceID'
key = value of how many times that particular 'ServiceID' is listed as a 'primary'
key = value of how many times that particular 'ServiceID' is listed as a 'secondary'

For example:

[
    {'ServiceID': 8, 'primaryCount': 2, 'secondaryCount': 1},
    {'ServiceID': 12, 'primaryCount': 1, 'secondaryCount': 4},
    {'ServiceID': 16, 'primaryCount': 1, 'secondaryCount': 2},
    {'ServiceID': 20, 'primaryCount': 4, 'secondaryCount': 1}
]

The code I have so far doesn't quite do what I want: I am unsure how to increment the primary and secondary counts correctly across the entire for loop, and how to ensure I capture only the unique 'ServiceID' values.

I believe there is something wrong with my logic:

temp_count_list = list()
temp_primary_counts = 0
temp_secondary_counts = 0

for sub_dict in new_list:
    temp_dict = dict()

    temp_dict['ServiceID'] = sub_dict['ServiceID']
    
    if sub_dict['ServiceID'] == int(sub_dict['primarySlice']):
        temp_dict['primaryCount'] = temp_primary_counts +=1

    if sub_dict['ServiceID'] == int(sub_dict['secondarySlice']):
        temp_dict['secondaryCount'] = temp_secondary_counts +=1

    temp_count_list.append(temp_dict)

The basic idea is to collect all the ServiceID, primary, and secondary values in a dict (k in the code), and then, for each unique ServiceID, count how often that ServiceID appears among the primary and secondary values.

l = [
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 16, 'primary': '16', 'secondary': '8'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'},
    {'ServiceID': 12, 'primary': '12', 'secondary': '20'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'}
]

k =     {'ServiceID': [], 'primaryCount': [], 'secondaryCount': []}

for i in l:
    k['ServiceID'].append(i['ServiceID'])
    k['primaryCount'].append(i['primary'])
    k['secondaryCount'].append(i['secondary'])

res = {'ServiceID': 0, 'primaryCount': [], 'secondaryCount': []}

result = []

for i in sorted(set(k['ServiceID'])):
    res['ServiceID']=i
    res['primaryCount'] = k['primaryCount' ].count(str(i))
    res['secondaryCount'] = k['secondaryCount' ].count(str(i))
    result.append(res)
    res = {'ServiceID': 0, 'primaryCount': [], 'secondaryCount': []}

print(result)

Output:

[
 {'ServiceID': 8, 'primaryCount': 2, 'secondaryCount': 1},
 {'ServiceID': 12, 'primaryCount': 1, 'secondaryCount': 4}, 
 {'ServiceID': 16, 'primaryCount': 1, 'secondaryCount': 2},
 {'ServiceID': 20, 'primaryCount': 4, 'secondaryCount': 1}
]

You can do the following (l is your list):

d={i['ServiceID']:{'primaryCount':0, 'secondaryCount':0} for i in l}

for i in l:
    d[int(i['primary'])]['primaryCount']+=1
    d[int(i['secondary'])]['secondaryCount']+=1

res=[{'ServiceID':i, 'primaryCount': k['primaryCount'], 'secondaryCount': k['secondaryCount']} for i, k in d.items()]

Output:

>>> print(res)

[{'ServiceID': 20, 'primaryCount': 4, 'secondaryCount': 1}, {'ServiceID': 16, 'primaryCount': 1, 'secondaryCount': 2}, {'ServiceID': 8, 'primaryCount': 2, 'secondaryCount': 1}, {'ServiceID': 12, 'primaryCount': 1, 'secondaryCount': 4}]
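
Two things are worth noting about the snippet above: the result follows insertion order rather than the sorted order the question asks for, and d[int(i['primary'])] raises KeyError if a primary or secondary value never occurs as a ServiceID. The following is only a sketch of one way to handle both, assuming such unmatched values should still be counted. The shortened two-row input is made up for illustration, and using defaultdict here is a substitution, not part of the answer above.

```python
from collections import defaultdict

# Hypothetical input: 8 and 12 appear as secondaries but never as a ServiceID
l = [
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 16, 'primary': '16', 'secondary': '8'},
]

# Missing keys get a fresh zeroed counter pair instead of raising KeyError
d = defaultdict(lambda: {'primaryCount': 0, 'secondaryCount': 0})

for i in l:
    d[int(i['primary'])]['primaryCount'] += 1
    d[int(i['secondary'])]['secondaryCount'] += 1

# Sorting the items orders the result by ServiceID (keys are unique,
# so the comparison never falls through to the dict values)
res = [{'ServiceID': sid, **counts} for sid, counts in sorted(d.items())]
print(res)
```

If unmatched values should be rejected instead, keeping the plain dict and letting the KeyError surface is arguably the simpler choice.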

It seems like the correct solution here would involve using collections.Counter objects (or, largely equivalently in this case, collections.defaultdict(int)) to cheaply and easily increment counts without relying on equal values being adjacent in the input, and without intermediate data structures that add pointless overhead. Why build the result all at once when you can count the parts you care about with simpler code, then build the result from those counts with equally simple code? You don't actually use the 'ServiceID' field in the input, so you may as well just count efficiently and convert back to the preferred format at the end:

import pprint  # For pretty-printing in the example
from collections import Counter

inp = [
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 16, 'primary': '16', 'secondary': '8'},
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'},
    {'ServiceID': 12, 'primary': '12', 'secondary': '20'},
    {'ServiceID': 8,  'primary': '8',  'secondary': '16'}
]

primarycount = Counter()
secondarycount = Counter()
for d in inp:
   primarycount[int(d['primary'])] += 1      # Counts times seen as primary
   secondarycount[int(d['secondary'])] += 1  # Counts times seen as secondary

# Just to see intermediate results
print(primarycount)
print(secondarycount)

# Make new list mapping each thing seen to its counts
# The union of keys ensures anything with even one count in input appears in the output
# Sorting the union before iterating gets desired output order
result = [{'ServiceID': sid, 'primaryCount': primarycount[sid], 'secondaryCount': secondarycount[sid]}
          for sid in sorted(primarycount.keys() | secondarycount.keys())]
pprint.pprint(result)


which produces the output:

Counter({20: 4, 8: 2, 16: 1, 12: 1})
Counter({12: 4, 16: 2, 8: 1, 20: 1})
[{'ServiceID': 8, 'primaryCount': 2, 'secondaryCount': 1},
 {'ServiceID': 12, 'primaryCount': 1, 'secondaryCount': 4},
 {'ServiceID': 16, 'primaryCount': 1, 'secondaryCount': 2},
 {'ServiceID': 20, 'primaryCount': 4, 'secondaryCount': 1}]

This might be slightly wrong in two cases. If some ServiceIDs appear in the input but never as a primary or secondary, they won't appear in the output at all, rather than appearing with zero counts. If primary or secondary values appear whose corresponding ServiceID never appears in the input, they'll show up in the output with counts, rather than being omitted. It's unclear which behavior is correct in either case, but it's relatively trivial to fix. Flipping both behaviors just involves changing primarycount.keys() | secondarycount.keys() to {d['ServiceID'] for d in inp}, so that the output keys come from the input ServiceID fields rather than from the combination of all values seen for primary and secondary. For the provided input, both approaches are equivalent (with the former being slightly faster in most cases, where there are many duplicate ServiceIDs in the input).
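
To make the difference between the two key sets concrete, here is a small sketch. The three-row input is invented for illustration (ServiceID 30 is never a primary or secondary, and the secondary value '99' has no matching ServiceID), and build is a hypothetical helper name, not part of the code above.

```python
from collections import Counter

# Hypothetical input chosen to trigger both edge cases
inp = [
    {'ServiceID': 20, 'primary': '20', 'secondary': '12'},
    {'ServiceID': 12, 'primary': '12', 'secondary': '99'},
    {'ServiceID': 30, 'primary': '20', 'secondary': '12'},
]

primarycount = Counter(int(d['primary']) for d in inp)
secondarycount = Counter(int(d['secondary']) for d in inp)

def build(ids):
    # Hypothetical helper: one output row per id, sorted by id.
    # Looking up a missing key in a Counter returns 0 without adding it.
    return [{'ServiceID': sid,
             'primaryCount': primarycount[sid],
             'secondaryCount': secondarycount[sid]}
            for sid in sorted(ids)]

# Keys taken from every value seen as primary or secondary: 99 appears, 30 does not
by_seen_values = build(primarycount.keys() | secondarycount.keys())

# Keys taken from the ServiceID field: 30 appears with zero counts, 99 is omitted
by_service_id = build({d['ServiceID'] for d in inp})

print(by_seen_values)
print(by_service_id)
```

Which of the two is right depends on whether a ServiceID with no counts should be reported, which the question leaves open.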

