Python: count the number of items in a list and store the counts in a dictionary

Question

I have the following list:

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

I want to count the number of items with each file extension and store the counts in a dictionary.

The expected output is:

extn_dict = {'jpg': 3, 'mov': 2, 'pdf': 4}

I'm writing the following code:

for item in files_list:
    extn_dict[item[-3:]] = count(item) # I understand I should not have count() here but I'm not sure how to count them.

How can I count the number of items in the list with a particular extension?

Answer 1

>>> from collections import Counter
>>> files_list
['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
>>> c = Counter(x.split(".")[-1] for x in files_list)
>>> c
Counter({'pdf': 4, 'jpg': 3, 'mov': 2})
>>>
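Since Counter is a dict subclass, the result above can be used wherever a dictionary is expected. A minimal sketch, assuming a plain dict like the extn_dict from the question is wanted (the names c and extn_dict come from the posts above; ranked is only an illustrative name):

```python
from collections import Counter

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov',
              'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

# Count the text after the last dot of each file name.
c = Counter(name.split(".")[-1] for name in files_list)

# Counter subclasses dict, so dict() gives a plain dictionary with the same items.
extn_dict = dict(c)

# most_common() lists (extension, count) pairs from most to least frequent.
ranked = c.most_common()

print(extn_dict)
print(ranked)
```

Converting with dict() is optional; it mainly matters if the Counter repr or Counter-specific behaviour (such as returning 0 for missing keys) is not wanted.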

Answer 2

The easiest way is probably:

>>> d = {}
>>> for item in files_list:
...     d[item[-3:]] = d.get(item[-3:], 0) + 1
... 
>>> d
{'pdf': 4, 'mov': 2, 'jpg': 3}

Answer 3

The easiest way is to loop over the list and use a dictionary to store your counts.

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov',
              'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
counts = {}
for f in files_list:
    ext = f[-3:]  # last three characters, e.g. 'jpg'
    if ext not in counts:
        counts[ext] = 0
    counts[ext] += 1

print(counts)
# {'jpg': 3, 'mov': 2, 'pdf': 4}

No doubt there are other, fancier solutions, but I think this one is easier to understand.

If you can't assume that the extension will always be 3 characters, you can change the line that assigns ext to:

ext = f.split(".")[-1]

This is the approach other posters have shown in their answers.
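One caveat with split(".")[-1] is that a name without any dot, such as 'README', is returned whole and counted as its own extension. A minimal sketch of an alternative using os.path.splitext from the standard library; the helper name count_extensions, the sample list, and the lower-casing of extensions are assumptions for illustration, not part of the answers here:

```python
import os.path
from collections import Counter

def count_extensions(names):
    """Count file extensions (hypothetical helper, not from the answers)."""
    counts = Counter()
    for name in names:
        # splitext returns (root, ext); ext includes the leading dot, or is ''.
        ext = os.path.splitext(name)[1]
        # Assumption: drop the dot and lower-case so 'JPG' and 'jpg' share a key.
        counts[ext[1:].lower()] += 1
    return dict(counts)

# Illustrative input with mixed case, a long extension, two dots, and no dot.
sample = ['pic1.jpg', 'PIC2.JPG', 'photo.jpeg', 'archive.tar.gz', 'README']
print(count_extensions(sample))
```

In this sketch files without an extension are counted under the empty string; whether that is desirable depends on the data.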

Answer 4

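files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
# Extension of every file, duplicates included (a list, not a set)
extensions = [i.split('.')[-1] for i in files_list]
# list.count() rescans the list for each entry; repeated keys overwrite the same value
d = {j: extensions.count(j) for j in extensions}
print(d)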

Analysis:

Current method - 10000 loops, best of 3: 25.3 µs per loop

Counter - 10000 loops, best of 3: 30.5 µs per loop (33.3 µs per loop with the import statement)

itertools - 10000 loops, best of 3: 41.1 µs per loop (44 µs per loop with the import statement)
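Timings like these depend on the machine, the Python version, and the size of the list, so they are worth re-measuring locally. A rough sketch with the standard timeit module, assuming the same nine-item list; the function names with_list_count and with_counter are made up for this example:

```python
import timeit
from collections import Counter

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov',
              'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

def with_list_count():
    # The list.count() approach from this answer.
    extensions = [name.split('.')[-1] for name in files_list]
    return {ext: extensions.count(ext) for ext in extensions}

def with_counter():
    # The collections.Counter approach from another answer.
    return dict(Counter(name.split('.')[-1] for name in files_list))

# Both approaches must agree before comparing their speed is meaningful.
assert with_list_count() == with_counter()

for func in (with_list_count, with_counter):
    # repeat=3, number=10000 mirrors "10000 loops, best of 3".
    best = min(timeit.repeat(func, repeat=3, number=10000))
    print(f"{func.__name__}: {best / 10000 * 1e6:.1f} µs per loop")
```

Because list.count() scans the whole list once per element, its cost grows roughly quadratically with the list length, so the ranking on a nine-item list may not hold for much longer inputs.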

Answer 5

You can use itertools.groupby:

import itertools
files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
get_ext = lambda x: x.split('.')[-1]
# groupby only merges adjacent items, so sort by the same key first
by_ext = itertools.groupby(sorted(files_list, key=get_ext), key=get_ext)
final_counts = {a: len(list(b)) for a, b in by_ext}

Output:

{'pdf': 4, 'mov': 2, 'jpg': 3}
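The sorted() call in this approach matters because itertools.groupby only groups items that are next to each other. A small sketch of the difference, using a made-up three-item list (mixed) and a helper named extension that are not part of the answer:

```python
import itertools

def extension(name):
    # Text after the last dot of the file name.
    return name.split('.')[-1]

# Illustrative input where the two .jpg files are not adjacent.
mixed = ['a.jpg', 'b.pdf', 'c.jpg']

# Without sorting, the .jpg files end up in two separate groups.
unsorted_groups = [(ext, len(list(group)))
                   for ext, group in itertools.groupby(mixed, key=extension)]

# After sorting by the same key, each extension forms exactly one group.
sorted_groups = [(ext, len(list(group)))
                 for ext, group in itertools.groupby(sorted(mixed, key=extension),
                                                     key=extension)]

print(unsorted_groups)  # [('jpg', 1), ('pdf', 1), ('jpg', 1)]
print(sorted_groups)    # [('jpg', 2), ('pdf', 1)]
```

In a dict comprehension the unsorted version would silently keep only the last group for each extension, giving {'jpg': 1, 'pdf': 1} here instead of the real counts.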

Answer 6

You can use the Counter class from the collections module:

from collections import Counter
files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
temp = []
for item in files_list:
    temp.append(item[-3:])

print(Counter(temp))
# Counter({'pdf': 4, 'jpg': 3, 'mov': 2})

Answer 7

Use Counter with map instead of a list comprehension:

from collections import Counter
Counter(map(lambda x: x.split('.')[-1], files_list))

7 answers

Answer 1: score 12, accepted, 2018-02-15 18:50:37

Answer 2: score 2, 2018-02-15 19:05:37

Answer 3: score 1, 2018-02-15 18:50:38

Answer 4: score 1, 2018-02-15 18:54:57

Answer 5: score 0, 2018-02-15 18:53:13

Answer 6: score 0, 2018-02-15 18:57:32

Answer 7: score 0, 2018-02-15 18:58:45
