Counting words in Python from a text file
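I need to open a text file and count how many times each name listed in another file occurs in it. The program should write the name; count pairs, separated by semicolons, to a .csv file.
It should look like: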
Jane; 77
Hector; 34
Anna; 39
...
I tried to use Counter, but the result looks like a list, so I think this is the wrong approach for the task:
import re
from collections import Counter
# names to look for, read from the first file
wanted = re.findall(r'\w+', open('iliadcounts.csv').read().lower())
cnt = Counter()
# every word of the text to search
words = re.findall(r'\w+', open('pg6130.txt').read().lower())
for word in words:
    if word in wanted:
        cnt[word] += 1
print(cnt)
But this is definitely not the right code for this task...
You can pass the whole list of words to Counter at once and it will count them for you. Then you can iterate over wanted and print only those words:
import re
from collections import Counter
# create some demo data as I do not have your data at hand - uses your filenames
def create_demo_files():
    with open('iliadcounts.csv', "w") as f:
        f.write("hug,crane,box")
    with open('pg6130.txt', "w") as f:
        f.write("hug,shoe,blues,crane,crane,box,box,box,wood")
create_demo_files()
# work with your files
with open('iliadcounts.csv') as f:
    wanted = re.findall(r'\w+', f.read().lower())
with open('pg6130.txt') as f:
    cnt = Counter(re.findall(r'\w+', f.read().lower()))
# printed output for all words in wanted (all words are counted)
for word in wanted:
    # cnt[word] is 0 for a word that never occurs; cnt.get(word) would give None
    print("{}; {}".format(word, cnt[word]))
    # would work as well:
    # https://docs.python.org/3/library/string.html#string-formatting
    # print(f"{word}; {cnt[word]}")
Output:
hug; 1
crane; 2
box; 3
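The question asks for the pairs to be written to a file rather than printed. A minimal sketch of that step follows, assuming the "name; count" layout shown in the question. The helper names count_wanted and write_counts are hypothetical, and the demo writes to an in-memory buffer so it runs without any files; with real data you would pass a file object opened for writing instead.

```python
import io
import re
from collections import Counter


def count_wanted(text, wanted):
    """Return (word, count) pairs for each wanted word, in the order given."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    # A Counter yields 0 for keys it has never seen, so absent names report 0.
    return [(word, counts[word]) for word in wanted]


def write_counts(rows, fileobj):
    """Write each pair as a 'name; count' line to an open text file object."""
    for word, count in rows:
        fileobj.write(f"{word}; {count}\n")


if __name__ == "__main__":
    demo_text = "hug,shoe,blues,crane,crane,box,box,box,wood"
    rows = count_wanted(demo_text, ["hug", "crane", "box"])
    buffer = io.StringIO()  # stands in for a file opened with mode "w"
    write_counts(rows, buffer)
    print(buffer.getvalue(), end="")
```

Keeping the counting and the writing in separate functions means the same rows can be printed, written to a file, or checked in a test without reading the input twice.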
Or you can print the whole counter:
print(cnt)
Output:
Counter({'box': 3, 'crane': 2, 'hug': 1, 'shoe': 1, 'blues': 1, 'wood': 1})
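The printed form may look like a list of pairs, but a Counter is a dict subclass, so it supports key lookup and can be narrowed to the wanted names. The sketch below illustrates this with the same demo words; restrict_to is a hypothetical helper, not part of the collections module.

```python
from collections import Counter

words = ["hug", "shoe", "blues", "crane", "crane", "box", "box", "box", "wood"]
cnt = Counter(words)


def restrict_to(counter, wanted):
    """Return a new Counter holding only the wanted keys that actually occur."""
    return Counter({word: counter[word] for word in wanted if word in counter})


wanted_counts = restrict_to(cnt, ["hug", "crane", "box"])

print(isinstance(cnt, dict))         # True
print(cnt["crane"])                  # 2
print(cnt["missing"])                # 0, and the key is not added
print(wanted_counts.most_common(2))  # [('box', 3), ('crane', 2)]
```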