How to reduce a list of tuples in Python

Question

I have an array, and I want to count the occurrences of each item in it.

I have managed to use the map function to produce a list of tuples.

def mapper(a):
    return (a, 1)

r = list(map(mapper, arr))

# output example:
# (11817685, 1), (2014036792, 1), (2014047115, 1), (11817685, 1)

I expected the reduce function to help me group the counts by the first number (the id) in each tuple. For example:

(11817685, 2), (2014036792, 1), (2014047115, 1)

I tried

cnt = reduce(lambda a, b: a + b, r)

and a few other approaches, but none of them solved the problem.
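For reference, here is a minimal sketch of what that attempt actually computes. The `+` operator on two tuples concatenates them, so the reduce flattens the pairs into one long tuple instead of grouping them by id. The variable name `flat` is made up for this illustration.

```python
from functools import reduce

# The mapped pairs from the question.
r = [(11817685, 1), (2014036792, 1), (2014047115, 1), (11817685, 1)]

# Tuple + tuple is concatenation, so nothing is grouped or summed.
flat = reduce(lambda a, b: a + b, r)
print(flat)
```

To group by id, the reducer needs an accumulator that is keyed by id (for example a dict), rather than tuple addition.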

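Note: thank you for all the suggestions about other ways to solve the problem, but I am just learning Python and how to implement map-reduce in it, and I have simplified my real business problem a lot to make it easy to understand, so please show me the correct way to do map-reduce.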

Answer 1

You can use Counter:

from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
counter = Counter(arr)
print(list(zip(counter.keys(), counter.values())))

Edit:

As @ShadowRanger pointed out, Counter has an items() method:

from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
print(list(Counter(arr).items()))
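Since the question asks specifically for a map-reduce shape, Counter can also be combined with reduce. The sketch below is only an illustration, not part of the original answer: the mapper turns each item into a one-entry Counter, and the reducer is plain Counter addition via `operator.add`. It builds a new Counter at every step, so for real data `Counter(arr)` is the cheaper choice.

```python
from collections import Counter
from functools import reduce
import operator

arr = [11817685, 2014036792, 2014047115, 11817685]

def mapper(a):
    # Map step: one item becomes a Counter holding a single count of 1.
    return Counter({a: 1})

# Reduce step: adding two Counters sums the counts key by key.
# The empty Counter() is the initial value, so an empty arr also works.
total = reduce(operator.add, map(mapper, arr), Counter())

print(sorted(total.items()))
# [(11817685, 2), (2014036792, 1), (2014047115, 1)]
```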

Answer 2

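Instead of importing any module, you can do this with a little plain logic:

arr = [11817685, 2014036792, 2014047115, 11817685]

track = {}
for intr in arr:
    # first occurrence starts the count at 1; later ones increment it
    if intr not in track:
        track[intr] = 1
    else:
        track[intr] += 1

A worked example follows. There is a common pattern for list problems of this kind.

So, suppose you have a list: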

a=[(2006,1),(2007,4),(2008,9),(2006,5)]

and you want to convert it to a dictionary, with the first element of each tuple as the key and the second element as the value. Something like:

{2008: [9], 2006: [5], 2007: [4]}

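But there is a catch: some tuples share a key while carrying different values, for example (2006,1) and (2006,5). You want those values collected under a single key, so the expected output is: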

{2008: [9], 2006: [1, 5], 2007: [4]}

For this kind of problem, we do the following:

First create a new dictionary, then follow this pattern:

if item[0] not in new_dict:
    new_dict[item[0]]=[item[1]]
else:
    new_dict[item[0]].append(item[1])

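So we first check whether the key is already in the new dictionary. If it is not, we start a new list for it; if it is, we append the value of the repeated key to the existing list.

Full code: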

a=[(2006,1),(2007,4),(2008,9),(2006,5)]

new_dict={}

for item in a:
    if item[0] not in new_dict:
        new_dict[item[0]]=[item[1]]
    else:
        new_dict[item[0]].append(item[1])

print(new_dict)

Output (on Python 3.7+, where dicts preserve insertion order):

{2006: [1, 5], 2007: [4], 2008: [9]}
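The grouping pattern above corresponds to the "group by key" stage of map-reduce; summing each group afterwards gives the reduce stage. The sketch below shows both steps, using `collections.defaultdict(list)` in place of the explicit membership check. The helper name `group_by_key` and the `totals` variable are made up for this illustration.

```python
from collections import defaultdict

def group_by_key(pairs):
    # Collect every value under its key; a missing key starts as an empty list.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

a = [(2006, 1), (2007, 4), (2008, 9), (2006, 5)]

grouped = group_by_key(a)
# Reduce each group to a single number by summing its values.
totals = {key: sum(values) for key, values in grouped.items()}

print(grouped)
print(totals)
```

Applied to the (id, 1) pairs from the question, the same two steps yield the per-id counts.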

Answer 3

If you only need cnt, a dict is probably a better fit here than a list of tuples (if you do need that format, just use dict.items).

For this, the collections module has a useful data structure, defaultdict.

from collections import defaultdict

arr = [11817685, 2014036792, 2014047115, 11817685]

cnt = defaultdict(int)  # missing keys default to the result of int(), i.e. 0
for key in arr:
    cnt[key] += 1  # a key not yet in cnt starts from the default 0

# cnt_list = list(cnt.items())

Answer 4

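After writing an answer to another question, I remembered this post and thought a similar answer would be helpful here.

Here is one way to use reduce on the list to get the desired output.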

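from functools import reduce  # reduce is not a builtin in Python 3

arr = [11817685, 2014036792, 2014047115, 11817685]

def mapper(a):
    return (a, 1)

def reducer(x, y):
    if isinstance(x, dict):
        # x is the dict of counts accumulated so far; fold the next pair into it
        ykey, yval = y
        if ykey not in x:
            x[ykey] = yval
        else:
            x[ykey] += yval
        return x
    else:
        # first call: x and y are both (key, count) tuples
        xkey, xval = x
        ykey, yval = y
        a = {xkey: xval}
        if ykey in a:
            a[ykey] += yval
        else:
            a[ykey] = yval
        return a

mapred = reduce(reducer, map(mapper, arr))

print(list(mapred.items()))

This prints (on Python 3.7+, where dicts preserve insertion order):

[(11817685, 2), (2014036792, 1), (2014047115, 1)]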

See the linked answer for a more detailed explanation.
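The reducer above has to branch on isinstance because reduce is called without an initial value, so its first call receives two tuples. A possible simplification, sketched here as an alternative rather than as the answer's own method, is to pass an empty dict as the third argument of reduce. The accumulator is then always a dict, and empty or single-element inputs also work. The wrapper name `count_with_reduce` is made up for this illustration.

```python
from functools import reduce

def mapper(a):
    # Map step: pair each item with a count of 1.
    return (a, 1)

def reducer(acc, pair):
    # Reduce step: acc is always a dict because reduce starts from {}.
    key, val = pair
    acc[key] = acc.get(key, 0) + val
    return acc

def count_with_reduce(items):
    # A fresh {} per call keeps separate calls from sharing state.
    return reduce(reducer, map(mapper, items), {})

arr = [11817685, 2014036792, 2014047115, 11817685]
counts = count_with_reduce(arr)

print(list(counts.items()))
# [(11817685, 2), (2014036792, 1), (2014047115, 1)]
```

Note that the reducer mutates its accumulator for efficiency, which is fine here because the dict is created inside `count_with_reduce` and never shared.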

4 answers

Answer 1
5 2017-12-13 02:50:27

Answer 2
1 2017-12-13 05:18:52

Answer 3
0 2017-12-13 02:52:02

Answer 4
0 2018-01-19 15:00:42
