Group a list of Python dictionaries by the first element of one key and apply calculations to the other values?

Question

I have the following test list:

testing = [
{'score': [('a', 90)],'text': 'abc'},
{'score': [('a', 80)], 'text': 'kuku'},
{'score': [('a', 70)], 'text': 'lulu'},
{'score': [('b', 90)], 'text': 'dalu'},
{'score': [('b', 86)], 'text': 'pupu'},
{'score': [('b', 80)], 'text': 'mumu'},
{'score': [('c', 46)], 'text': 'foo'},
{'score': [('c', 26)], 'text': 'too'}
]

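I want to go through each dictionary, group by the first element of the score tuple (a, b, or c), average the second element, and collect the text values for each group, to get the following: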

{"a": {"avg_score": 80, "texts_unique": ['abc', 'kuku', 'lulu']}, "b": the same logic... }

I have seen pandas approaches. Are there any best practices?

Answer 1

Try:

from statistics import mean

testing = [
    {"score": [("a", 90)], "text": "abc"},
    {"score": [("a", 80)], "text": "kuku"},
    {"score": [("a", 70)], "text": "lulu"},
    {"score": [("b", 90)], "text": "dalu"},
    {"score": [("b", 86)], "text": "pupu"},
    {"score": [("b", 80)], "text": "mumu"},
    {"score": [("c", 46)], "text": "foo"},
    {"score": [("c", 26)], "text": "too"},
]

out = {}
for d in testing:
    out.setdefault(d["score"][0][0], []).append((d["score"][0][1], d["text"]))

out = {
    k: {
        "avg_score": mean(i for i, _ in v),
        "texts_unique": list(set(i for _, i in v)),
    }
    for k, v in out.items()
}
print(out)

Prints (the order inside texts_unique may vary, since a set is unordered):

{
    "a": {"avg_score": 80, "texts_unique": ["abc", "kuku", "lulu"]},
    "b": {
        "avg_score": 85.33333333333333,
        "texts_unique": ["mumu", "dalu", "pupu"],
    },
    "c": {"avg_score": 36, "texts_unique": ["foo", "too"]},
}
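
If the order of the texts matters, one possible variation (a sketch, not part of the original answer) collects them with `dict.fromkeys`, which drops duplicates while keeping first-seen order. The function name `summarize` and the sample data are made up for illustration:

```python
from collections import defaultdict
from statistics import mean


def summarize(records):
    """Group records by the letter in 'score' and summarize each group."""
    scores = defaultdict(list)
    texts = defaultdict(list)
    for record in records:
        # Uses only the first tuple in 'score', as in the question's data
        letter, value = record["score"][0]
        scores[letter].append(value)
        texts[letter].append(record["text"])
    return {
        letter: {
            "avg_score": mean(scores[letter]),
            # dict.fromkeys drops duplicates but keeps first-seen order
            "texts_unique": list(dict.fromkeys(texts[letter])),
        }
        for letter in scores
    }


# Hypothetical sample with a repeated text to show the de-duplication
sample = [
    {"score": [("a", 90)], "text": "abc"},
    {"score": [("a", 70)], "text": "abc"},
    {"score": [("b", 86)], "text": "pupu"},
]
result = summarize(sample)
print(result)
```

Keeping two plain dictionaries of lists avoids unpacking tuples later, at the cost of a second lookup per record.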

Answer 2

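You can use itertools.groupby to group the data by the letter key, then use a helper function to return the desired object for each letter. Note that groupby only groups consecutive items, so this relies on testing already being ordered by that key: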

import itertools

def grouper(g):
    return {
        'avg_score': sum(t['score'][0][1] for t in g) / len(g),
        'texts_unique': list(set(t['text'] for t in g)),
    }

res = {
    k: grouper(list(g))
    for k, g in itertools.groupby(testing, key=lambda t: t['score'][0][0])
}

Output:

{
    "a": {
        "avg_score": 80.0,
        "texts_unique": [
            "abc",
            "lulu",
            "kuku"
        ]
    },
    "b": {
        "avg_score": 85.33333333333333,
        "texts_unique": [
            "mumu",
            "dalu",
            "pupu"
        ]
    },
    "c": {
        "avg_score": 36.0,
        "texts_unique": [
            "foo",
            "too"
        ]
    }
}
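
When the input is not already ordered by letter, it can be sorted with the same key function before grouping. A minimal sketch, assuming made-up sample data and a hypothetical `letter` helper; `sorted` is also used on the texts here so the result is deterministic:

```python
import itertools


def letter(record):
    # Grouping key: first element of the first tuple in 'score'
    return record["score"][0][0]


def grouper(group):
    return {
        "avg_score": sum(r["score"][0][1] for r in group) / len(group),
        # A sorted set gives unique texts in a stable order
        "texts_unique": sorted({r["text"] for r in group}),
    }


# Hypothetical input where equal letters are not adjacent
shuffled = [
    {"score": [("b", 90)], "text": "dalu"},
    {"score": [("a", 90)], "text": "abc"},
    {"score": [("b", 80)], "text": "mumu"},
    {"score": [("a", 70)], "text": "lulu"},
]

res = {
    k: grouper(list(g))
    for k, g in itertools.groupby(sorted(shuffled, key=letter), key=letter)
}
print(res)
```

Without the `sorted` call, this input would produce four separate groups, and the dict comprehension would keep only the last group for each letter.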
