
Counting words from a file in Python

I'm completely new to Python, but to my own surprise I've produced this working piece of code:

if __name__ == "__main__":
    # print every line of the word list
    with open("wordlist.txt") as infile:
        for line in infile:
            print(line)

# count how often the string "bad" occurs in the text
with open("cv000_29416.txt", "r") as myfile:
    data = myfile.read().replace('\n', '')
print(data.count("bad"))

The point is that I want to count how often the words from wordlist.txt occur in cv000_29416.txt.

(So wordlist.txt contains, for example, twenty words like 'bad', 'good', et cetera, and cv000_29416.txt is a long text. I want to count how many times 'bad', 'good', et cetera occur in cv000_29416.txt.)

Can I insert that somewhere in the second piece of code?

Thank you, and sorry for my bad English.

# create a collection of the words that you want to count
with open('wordlist.txt') as infile:
    counts = {}
    for line in infile:
        for word in line.split():
            counts[word] = 0

# increment the count of the words that you really care about
with open("cv000_29416.txt") as infile:
    for line in infile:
        for word in line.split():
            if word in counts:
                counts[word] += 1

for word,count in counts.items():
    print(word, "appeared", count, "times")

Use a collections.Counter dict to count all the words:

from collections import Counter

with open("cv000_29416.txt", "r") as myfile:
    data = Counter(myfile.read().split())
print(data["bad"])
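
Note that this does not count quite the same thing as data.count("bad") in the question: str.count matches substrings, so "badly" also counts as a hit, while a Counter built from split() only counts whole whitespace-separated tokens. A small sketch of the difference, using a made-up sample string rather than the real file:

```python
from collections import Counter

# Made-up sample text, used only for illustration.
text = "bad movie, badly acted, not bad at all"

# str.count matches substrings, so "badly" contributes a hit as well.
substring_hits = text.count("bad")

# A Counter over split() counts exact whitespace-separated tokens.
token_counts = Counter(text.split())
token_hits = token_counts["bad"]

print(substring_hits)  # 3
print(token_hits)      # 2
```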

To put it together, presuming each word is on a separate line in wordlist.txt:

from collections import Counter

with open("cv000_29416.txt", "r") as myfile, open("wordlist.txt") as infile:
    data = Counter(myfile.read().split())
    for line in infile:
        # rstrip() drops the trailing newline; words not in the text count as 0
        print(data.get(line.rstrip(), 0))
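
If the text contains punctuation or mixed case, split() leaves tokens such as "bad," or "Bad" that will not match the entries in the word list. One possible variant, assuming case-insensitive matching is what you want: count_listed_words is a hypothetical helper name, and the regular expression is only one simple choice of tokenizer.

```python
import re
from collections import Counter


def count_listed_words(text, wordlist):
    """Return {word: count} for each word in wordlist, ignoring case and punctuation."""
    # Lowercase the text and pull out runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for missing keys, so absent words are reported as 0.
    return {word: counts[word.lower()] for word in wordlist}


if __name__ == "__main__":
    # Made-up sample text, used only for illustration.
    sample = "Bad plot, bad acting. Good music, though."
    print(count_listed_words(sample, ["bad", "good", "awful"]))
```

With the files from the question, text would be the result of reading cv000_29416.txt and wordlist would be the stripped lines of wordlist.txt.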
