Fastest way to count occurrences of strings

Question

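I am counting the occurrences of strings read from a text file. My code works, but I would like to know whether there is a faster way to do it. Here is what I have:

First I read every string and put them all into a list. Then I build a list of the unique strings and use the count method to get the count for each one.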

input.txt

shoes
memory card
earphones
led bulb
mobile
earphones
led bulb
mobile

The above is my input file.

new = []
with open("input.txt") as inf:
    for line in inf:
        line = line.strip("\n")
        new.append(line)
unique = list(set(new))
for i in unique:
    # list.count rescans the whole list for every unique string
    cnt = new.count(i)
    print(i, cnt)

The output should look like this:

   mobile 2
   memory card 1
   led bulb 2
   shoes 1
   earphones 2

Answer 1

You can use a Counter:

from collections import Counter

with open("input.txt") as inf:
   c = Counter(l.strip() for l in inf)

This gives:

Counter({'led bulb': 2, 'earphones': 2, 'mobile': 2, 'memory card': 1, 'shoes': 1})

Or, to print each string with its count:

for k,v in c.items():
    print(k,v)

which gives:

memory card 1
mobile 2
earphones 2
led bulb 2
shoes 1
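
If you also want the output ordered by frequency, Counter has a most_common() method for that. A minimal sketch, assuming the file contents are inlined as a list so it runs without input.txt (the lines variable is only for illustration and is not part of the original answer):

```python
from collections import Counter

# Sample data inlined for illustration; in practice read it from input.txt.
lines = ["shoes", "memory card", "earphones", "led bulb",
         "mobile", "earphones", "led bulb", "mobile"]

c = Counter(lines)

# most_common() returns (string, count) pairs, highest count first.
for k, v in c.most_common():
    print(k, v)
```

The strings with a count of 2 are printed before those with a count of 1.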

Answer 2

A better approach is to count them with a dictionary:

count = {}
with open("input.txt") as inf:
    for L in inf:
        L = L.strip("\n")  # drop the newline so the keys are the bare strings
        count[L] = count.get(L, 0) + 1

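You end up with a dictionary that maps each line to its count.

The count method is fast because it is implemented in C, but it still has to scan the full list once for every unique string, so your implementation is O(n^2) (consider the worst case, where all the strings are distinct).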
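
To see the difference in practice, here is a rough sketch that runs both approaches on the worst case described above (every string distinct). The function names and the input size of 2000 are my own choices for illustration, and the timings will vary by machine:

```python
import timeit

def count_with_list_count(lines):
    # One full scan of `lines` per unique string: O(n^2) when all are distinct.
    return {s: lines.count(s) for s in set(lines)}

def count_single_pass(lines):
    # One dictionary update per line: O(n) on average.
    counts = {}
    for s in lines:
        counts[s] = counts.get(s, 0) + 1
    return counts

# Worst case for list.count: every string is distinct.
lines = [str(i) for i in range(2000)]

# Both approaches must agree on the result.
assert count_with_list_count(lines) == count_single_pass(lines)

print("list.count :", timeit.timeit(lambda: count_with_list_count(lines), number=3))
print("single pass:", timeit.timeit(lambda: count_single_pass(lines), number=3))
```

Doubling the input size should roughly quadruple the first timing and roughly double the second.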


Answer 1
3 accepted 2015-02-18 07:01:16

Answer 2
1 2015-02-18 06:59:11
