
Python - 2D list - find duplicates in one column and sum values in another column

I have a 2D list whose rows hold, in order, a football player's name, the number of goals they scored, and the number of shots they attempted.

player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35], ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]

From this list, I am trying to return another list that shows each player only once, together with their total goals and total attempts, like so:

player_stats_totals = [['Adam', 30, 45], ['Kyle', 12, 18], ['Jo', 26, 49], ['Charlie', 31, 58]]

After searching on Stack Overflow, I was able to learn (from this thread) how to return the indexes of the duplicated players:

x = [row[0] for row in player_stats]

for i in range(len(x)):
    if (x[i] in x[:i]) or (x[i] in x[i+1:]):
        print(x[i], i)

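but I am stuck on how to proceed from there, and I am not sure this approach is even relevant to what I need.

What is the most efficient way to return the desired list of totals?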

Use a dictionary to accumulate the values for a given player:

player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35], ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]

lookup = {}
for player, first, second in player_stats:
    
    # if the player has not been seen add a new list with 0, 0 
    if player not in lookup:
        lookup[player] = [0, 0]
    
    # get the accumulated total so far 
    first_total, second_total = lookup[player]
    
    # add the current values to the accumulated total, and update the values 
    lookup[player] = [first_total + first, second_total + second]

# create the output in the expected format
res = [[player, first, second] for player, (first, second) in lookup.items()]
print(res)

Output

[['Adam', 30, 45], ['Kyle', 12, 18], ['Jo', 26, 49], ['Charlie', 31, 58]]

A more advanced, Pythonic version is to use collections.defaultdict:

from collections import defaultdict

player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35],
                ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]

lookup = defaultdict(lambda: [0, 0])
for player, first, second in player_stats:
    # get the accumulated total so far
    first_total, second_total = lookup[player]

    # add the current values to the accumulated total, and update the values
    lookup[player] = [first_total + first, second_total + second]

# create the output in the expected format
res = [[player, first, second] for player, (first, second) in lookup.items()]

print(res)

This approach has the advantage of skipping the explicit initialization. Both approaches are O(n).
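Because a defaultdict hands back the stored list itself, the totals can also be updated in place instead of being rebuilt on every iteration. A minimal sketch, assuming the same player_stats layout as above; the names totals, goals and attempts are illustrative and not part of the answer:

```python
from collections import defaultdict

player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35],
                ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14], ['Adam', 10, 15]]

# the factory runs once per missing key, so every player gets a separate [0, 0] list
totals = defaultdict(lambda: [0, 0])
for player, goals, attempts in player_stats:
    entry = totals[player]   # created on first access, reused afterwards
    entry[0] += goals        # mutate the stored list in place
    entry[1] += attempts

# flatten back to [name, goals, attempts] rows, in first-seen order
res = [[player, *entry] for player, entry in totals.items()]
print(res)
```

The single pass over the input is what makes this O(n); each dictionary lookup is constant time on average.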

Note

The expression:

res = [[player, first, second] for player, (first, second) in lookup.items()]

is a list comprehension, equivalent to the following for loop:

res = []
for player, (first, second) in lookup.items():
    res.append([player, first, second])

Also, read this to learn about unpacking.
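As a rough illustration of the unpacking used in that loop, the sketch below spells out the two steps that the nested target player, (first, second) performs at once. The small lookup dictionary is made-up sample data for the example:

```python
lookup = {'Adam': [30, 45], 'Kyle': [12, 18]}

# step by step: each item is a (key, value) tuple, and the value is a two-element list
for item in lookup.items():
    player, totals = item
    first, second = totals
    print(player, first, second)

# the same thing in one step, using a nested target in the for statement
rows = []
for player, (first, second) in lookup.items():
    rows.append([player, first, second])
print(rows)
```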

What you want is a dictionary whose keys are the player names and whose values are lists of the form [goals, shots]. Building it looks like this:

all_games_stats = {}
for stat in player_stats:
    player, goals, shots = stat
    if player not in all_games_stats:
        all_games_stats[player] = [goals, shots]
    else:
        stat_list = all_games_stats[player]
        stat_list[0] += goals
        stat_list[1] += shots

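Then, if you want to represent the players and their stats as a list, you can call list(all_games_stats.items()), which yields (player, [goals, shots]) pairs.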
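If the flat [name, goals, shots] rows from the question are needed instead of pairs, one possible conversion is sketched below. The dictionary contents are abbreviated sample data, and the variable names pairs and flat are only illustrative:

```python
all_games_stats = {'Adam': [30, 45], 'Kyle': [12, 18]}

# items() pairs each name with its list of totals
pairs = list(all_games_stats.items())

# unpack each pair to build the flat row format used in the question
flat = [[player, goals, shots] for player, (goals, shots) in all_games_stats.items()]
print(flat)
```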

You can convert the list to a dictionary. (Once that is done, it can always be converted back.) This works:

player_stats = [['Adam', 5, 10], ['Kyle', 12, 18], ['Jo', 20, 35],
                ['Adam', 15, 20], ['Charlie', 31, 58], ['Jo', 6, 14],
                ['Adam', 10, 15]]

new_stats = {}


for item in player_stats:
    if item[0] not in new_stats:
        new_stats[item[0]] = [item[1],item[2]]
    else:
        new_stats[item[0]][0] += item[1]
        new_stats[item[0]][1] += item[2]

print(new_stats)

I might as well submit something too. Here is another approach that uses some list comprehensions:

# Build a dictionary keyed by unique player name, each starting at [goals, attempts] = [0, 0].
# fromkeys stores the same [0, 0] list under every key; that is safe here only because
# the loop below rebinds each entry to a new list and never mutates the shared one.
agg_stats = dict.fromkeys({p[0] for p in player_stats}, [0, 0])

# Iterate over the player stats list
for player in player_stats:
    # Replace the entry with the element-wise sum of the running totals and this row's values.
    agg_stats[player[0]] = [agg_stats[player[0]][i] + stat for i, stat in enumerate(player[1:])]
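One thing to watch with dict.fromkeys and a mutable default: every key refers to the same list object, so updating an entry in place would change all of them. The sketch below shows the difference, using a made-up names list; it is an illustration of the pitfall, not part of the answer above:

```python
names = ['Adam', 'Kyle']

# fromkeys evaluates [0, 0] once and stores that single list under every key
shared = dict.fromkeys(names, [0, 0])
shared['Adam'][0] += 5           # in-place update leaks into every player
print(shared)

# a dict comprehension evaluates [0, 0] once per key, so entries are independent
separate = {name: [0, 0] for name in names}
separate['Adam'][0] += 5
print(separate)
```

Also note that building the keys from a set means the order of players in the result is not guaranteed to match their order in the input.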

Yet another way is to store the whole triple (including the name) in the dict and update it:

stats = {}
for name, goals, attempts in player_stats:
    entry = stats.setdefault(name, [name, 0, 0])
    entry[1] += goals
    entry[2] += attempts
player_stats_totals = list(stats.values())
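The in-place updates work because setdefault returns the list that is actually stored in the dict. A small sketch of that behaviour, with illustrative variable names first and again:

```python
stats = {}

# key missing: the default list is inserted and returned
first = stats.setdefault('Adam', ['Adam', 0, 0])
first[1] += 5

# key present: the stored list is returned and the new default is ignored
again = stats.setdefault('Adam', ['Adam', 0, 0])
print(again is first, stats)
```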

For fun, a solution with complex numbers, which makes the addition nice but requires an annoying conversion back:

from collections import defaultdict

tmp = defaultdict(complex)
for name, *stats in player_stats:
    tmp[name] += complex(*stats)
player_stats_totals = [[name, int(stats.real), int(stats.imag)]
                       for name, stats in tmp.items()]
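One caveat with the complex-number trick: the real and imaginary parts are stored as floats, so integer totals are exact only up to 2**53. That is irrelevant for goal counts but worth knowing before reusing the idea elsewhere. A quick check, with illustrative names big and roundtrip:

```python
big = 2**53 + 1

# the value passes through a float on its way into and out of the complex number
roundtrip = int(complex(big, 0).real)
print(big, roundtrip, roundtrip == big)
```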

