Fastest way to count number of occurrences in a Python list

Question

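I have a Python list and I want to know the fastest way to count the number of occurrences of an item, '1', in this list. In my actual case the item can occur tens of thousands of times, which is why I want a fast way.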

['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

Would the collections module help? I'm using Python 2.7.

Answer 1

a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
print a.count("1")

It is probably heavily optimized at the C level.

Edit: I randomly generated a large list.

In [8]: len(a)
Out[8]: 6339347

In [9]: %timeit a.count("1")
10 loops, best of 3: 86.4 ms per loop

Edit edit: This can also be done with collections.Counter

from collections import Counter
a = Counter(your_list)
print a['1']

Using the same list as in my last timing example

In [17]: %timeit Counter(a)['1']
1 loops, best of 3: 1.52 s per loop

My timing is simplistic and depends on many different factors, but it gives you a good clue as to performance.
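The comparison can be reproduced outside IPython with the standard timeit module. The following is a minimal sketch, not the original benchmark: the list size, the random data, and the helper names count_with_list_method and count_with_counter are assumptions made for illustration. It uses print() with a single argument so that it runs on Python 3 as well as 2.7. Absolute numbers will differ by machine and Python version.

```python
import random
import timeit
from collections import Counter

# Hypothetical test data: 200,000 random string items (size chosen for illustration).
random.seed(0)
items = [random.choice(['1', '2', '7', '10']) for _ in range(200000)]


def count_with_list_method(seq, target):
    # One pass over the list, counting only the requested value.
    return seq.count(target)


def count_with_counter(seq, target):
    # Tallies every distinct value first, then looks up the requested one.
    return Counter(seq)[target]


if __name__ == "__main__":
    for func in (count_with_list_method, count_with_counter):
        # Take the best of three single runs to reduce noise.
        seconds = min(timeit.repeat(lambda: func(items, '1'), number=1, repeat=3))
        print("{}: {:.4f} s".format(func.__name__, seconds))
```

Both helpers return the same count; only the amount of work done to get it differs.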

Here is some profiling

In [24]: profile.run("a.count('1')")
         3 function calls in 0.091 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.091    0.091 <string>:1(<module>)
        1    0.091    0.091    0.091    0.091 {method 'count' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



In [25]: profile.run("b = Counter(a); b['1']")
         6339356 function calls in 2.143 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.143    2.143 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 _weakrefset.py:68(__contains__)
        1    0.000    0.000    0.000    0.000 abc.py:128(__instancecheck__)
        1    0.000    0.000    2.143    2.143 collections.py:407(__init__)
        1    1.788    1.788    2.143    2.143 collections.py:470(update)
        1    0.000    0.000    0.000    0.000 {getattr}
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  6339347    0.356    0.000    0.356    0.000 {method 'get' of 'dict' objects}
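
The profile shows where the time goes: Counter makes one dict get call per list element (6339347 of them), while list.count is a single call. Counter also does more work, because it tallies every distinct value rather than just one. A rough sketch of the case where that extra work might pay off, namely when several values need counting; the wanted list is hypothetical and the data is the list from the question:

```python
from collections import Counter

data = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
wanted = ['1', '2', '7', '10', '99']  # hypothetical set of values to look up

# Option A: one full pass over the list per requested value.
counts_by_scan = {value: data.count(value) for value in wanted}

# Option B: one pass to tally everything, then a dict lookup per value.
tally = Counter(data)
counts_by_tally = {value: tally[value] for value in wanted}  # missing keys give 0

print(counts_by_scan == counts_by_tally)
```

Which option is faster depends on the list length and on how many values are looked up, so it is worth timing both on real data.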

Answer 2

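Use a Counter dictionary to count the occurrences of all elements in a Python list, and to find the most common elements together with their counts.

If our Python list is:

l = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

To find the number of occurrences of each item in the list, use the following:

>>> from collections import Counter
>>> c = Counter(l)
>>> print c
Counter({'1': 6, '2': 4, '7': 3, '10': 2})

To list the items from most to least frequent:

>>> k = c.most_common()
>>> k
[('1', 6), ('2', 4), ('7', 3), ('10', 2)]

The highest count:

>>> k[0][1]
6

For the item itself, use k[0][0]:

>>> k[0][0]
'1'

For the n-th most common item and its number of occurrences in the list, use the following.

For n = 2:

>>> n = 2
>>> print k[n-1][0] # For item
2
>>> print k[n-1][1] # For value
4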
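
The n-th lookup can be wrapped in a small helper. This is a sketch only: nth_most_common is a hypothetical name, not part of collections, and it relies on most_common(n) returning just the top n entries. It is written to run on Python 3 as well as 2.7.

```python
from collections import Counter

l = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']


def nth_most_common(items, n):
    """Return (item, count) for the n-th most frequent item, with n starting at 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = Counter(items).most_common(n)  # top n entries, highest count first
    if len(ranked) < n:
        raise ValueError("fewer than n distinct items")
    return ranked[n - 1]


print(nth_most_common(l, 2))
```

When several items share the same count, their relative order in the ranking is not something to rely on.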

Answer 3

A combination of lambda and map can also do the job:

list_ = ['a', 'b', 'b', 'c']
sum(map(lambda x: x=="b", list_))
# Output: 2
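
This works because bool is a subclass of int, so each True adds 1 to the sum and each False adds 0. A short sketch comparing it with two equivalent spellings; the variable names via_map, via_genexp and via_count are chosen here for illustration:

```python
list_ = ['a', 'b', 'b', 'c']

# Each comparison yields True or False; sum() treats them as 1 and 0.
via_map = sum(map(lambda x: x == "b", list_))

# The same idea as a generator expression, without the lambda.
via_genexp = sum(1 for x in list_ if x == "b")

# The built-in list method gives the same answer in one call.
via_count = list_.count("b")

print((via_map, via_genexp, via_count))
```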

Answer 4

You can use pandas: convert the list to a pd.Series and then simply call .value_counts()

import pandas as pd
a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
a_cnts = pd.Series(a).value_counts().to_dict()

Input  >> a_cnts["1"], a_cnts["10"]
Output >> (6, 2)

Answer 5

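You can join the list into a single string with space-separated elements and then split it on the number/character you are searching for.

Clean and fast for large lists. Note that the pattern " 1" never matches a 1 at the very start of the list and also matches the beginning of values such as 10, so the result is only reliable when neither case occurs.

>>> L = [2,1,1,2,1,3]
>>> strL = " ".join(str(x) for x in L)
>>> strL
'2 1 1 2 1 3'
>>> count = len(strL.split(" 1")) - 1
>>> count
3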
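
Because the split pattern is a substring rather than a whole element, the result can differ from list.count. A small check, with count_via_split as a hypothetical wrapper around the same join-and-split steps and the example lists chosen to show where the two disagree:

```python
def count_via_split(values, target):
    """Count occurrences by joining with spaces and splitting on ' <target>'."""
    joined = " ".join(str(x) for x in values)
    return len(joined.split(" " + str(target))) - 1


examples = [
    [2, 1, 1, 2, 1, 3],  # both methods agree here
    [1, 2, 1],           # the leading 1 has no space before it
    [2, 10, 12, 1],      # " 1" also matches the start of " 10" and " 12"
]

for values in examples:
    # Prints the list, the split-based count, and the list.count result.
    print((values, count_via_split(values, 1), values.count(1)))
```

For the second and third lists the two counts differ, so list.count is the safer choice when the data is not known in advance.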


5 answers

Answer 1: score 69, accepted, 2012-09-17 03:03:54
Answer 2: score 12, 2015-02-11 09:52:58
Answer 3: score 0, 2018-01-16 03:44:57
Answer 4: score 0, 2018-10-09 07:43:03
Answer 5: score -1, 2016-12-03 18:18:18
