Python dictionaries: find the sum of values for duplicate keys

Question

I have a list of dictionaries that looks like the following:

[{
    "id": "42f409d0-2cef-49b0-a027-59ed571cc2a9",
    "cost": 0.868422,
    "environment": "nonprod"
},
{
    "id": "42f409d0-2cef-49b0-a027-59ed571cc2a9",
    "cost": 0.017017,
    "environment": "prod"
},
{
    "id": "aa385029-afa6-4f1a-a1d9-d88b7d934699",
    "cost": 0.010304,
    "environment": "prod"
},
{
    "id": "b13a0676-6926-49db-808c-3c968a9278eb",
    "cost": 2.870336,
    "environment": "nonprod"
},
{
    "id": "b13a0676-6926-49db-808c-3c968a9278eb",
    "cost": 0.00455,
    "environment": "prod"
},
{
    "id": "b13a0676-6926-49db-808c-3c968a9278eb",
    "cost": 0.032458,
    "environment": "prod"
}]

The part I'm having a hard time understanding is how to group these by id and environment and add up their costs for each environment.

The end result should have only a single prod entry and a single nonprod entry for a given ID, with all prod costs for that ID summed under the prod entry and all nonprod costs summed under the nonprod entry.

I hope this is enough detail, thank you!

Answer 1

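# data is the list of dictionaries from the question
d = {}
for el in data:
    key = (el["id"], el["environment"])  # group by (id, environment)
    d[key] = d.get(key, 0) + el["cost"]  # start from 0 the first time a key is seen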
d
# {('42f409d0-2cef-49b0-a027-59ed571cc2a9', 'nonprod'): 0.868422,
#  ('42f409d0-2cef-49b0-a027-59ed571cc2a9', 'prod'): 0.017017,
#  ('aa385029-afa6-4f1a-a1d9-d88b7d934699', 'prod'): 0.010304,
#  ('b13a0676-6926-49db-808c-3c968a9278eb', 'nonprod'): 2.870336,
#  ('b13a0676-6926-49db-808c-3c968a9278eb', 'prod'): 0.037008}
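If the result needs to be in the same list-of-dictionaries shape as the input, the tuple-keyed totals can be unpacked again. This is only a sketch: the sample `data` below is a made-up stand-in for the list in the question, and the names `totals` and `result` are illustrative, not part of the original answer.

```python
# Hypothetical sample input standing in for the list from the question.
data = [
    {"id": "a", "cost": 1.5, "environment": "prod"},
    {"id": "a", "cost": 2.0, "environment": "prod"},
    {"id": "a", "cost": 0.25, "environment": "nonprod"},
    {"id": "b", "cost": 4.0, "environment": "prod"},
]

# Same accumulation as in the answer: sum costs per (id, environment) pair.
totals = {}
for el in data:
    key = (el["id"], el["environment"])
    totals[key] = totals.get(key, 0) + el["cost"]

# Rebuild one dictionary per pair, in the order the pairs were first seen.
result = [
    {"id": id_, "cost": cost, "environment": env}
    for (id_, env), cost in totals.items()
]
```

Each (id, environment) pair now appears exactly once in `result`, with its costs summed.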

Answer 2

Try loading the dictionaries into a pandas DataFrame, then use pandas's groupby function (the variable dicts is set to your list of dictionaries above):

import pandas as pd
df = pd.DataFrame(dicts)
df.groupby(["id", "environment"], as_index=False).sum()

Output:


                                      id    environment  cost
0   42f409d0-2cef-49b0-a027-59ed571cc2a9    nonprod      0.868422
1   42f409d0-2cef-49b0-a027-59ed571cc2a9    prod         0.017017
2   aa385029-afa6-4f1a-a1d9-d88b7d934699    prod         0.010304
3   b13a0676-6926-49db-808c-3c968a9278eb    nonprod      2.870336
4   b13a0676-6926-49db-808c-3c968a9278eb    prod         0.037008

Answer 3

In addition to the answers above, a plain Python solution that builds a unique key of the form id|environment for each entry will also work. This piece of code does exactly that, and it can be made easier to read with defaultdict.

dictCounter = dict()
# assuming test is the list of dicts
for eachEntry in test:
    newUniqueKey = eachEntry["id"] + "|" + eachEntry["environment"]
    if newUniqueKey not in dictCounter:
        dictCounter[newUniqueKey] = eachEntry["cost"]
    else:
        dictCounter[newUniqueKey] += eachEntry["cost"]
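The defaultdict variant mentioned in this answer might look roughly like the following. It is a sketch under assumptions: the sample `test` list is invented for illustration, and `defaultdict(float)` is used so that a missing key starts at 0.0 and the if/else branch is no longer needed.

```python
from collections import defaultdict

# Hypothetical sample input; in the answer, test is the list of dicts.
test = [
    {"id": "a", "cost": 1.5, "environment": "prod"},
    {"id": "a", "cost": 2.0, "environment": "prod"},
    {"id": "a", "cost": 0.25, "environment": "nonprod"},
    {"id": "b", "cost": 4.0, "environment": "prod"},
]

# Missing keys default to float() == 0.0, so += works on first sight of a key.
dictCounter = defaultdict(float)
for eachEntry in test:
    newUniqueKey = eachEntry["id"] + "|" + eachEntry["environment"]
    dictCounter[newUniqueKey] += eachEntry["cost"]
```

One thing to keep in mind with a joined string key is that it assumes the separator "|" never occurs inside an id or environment value; a tuple key avoids that concern.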
