Removing duplicates from a csv word frequency list

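I'm currently stuck on this word frequency list. I almost have the final result, which prints each word along with its count, but I can't seem to get rid of the duplicates. I would appreciate it if someone could help me with this last part.

Here is what I have so far: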

import csv

input_file = input()
# Contents of input_file: hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy

with open(input_file, 'r') as csvfile:
    csvfile = csv.reader(csvfile)
    
    count = 0
    
    for line in csvfile:
        for word in line:
            count = line.count(word)
            # I am trying to print the words and their counts without any duplicates
            print(word, count)

You can use a dictionary, because it does not allow duplicate keys. Take a look:

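import csv

input_file = input()

with open(input_file, 'r') as csvfile:
    reader = csv.reader(csvfile)

    # Dictionary keys are unique, so each word is stored only once
    my_words = dict()

    for line in reader:
        for word in line:
            try:
                # If the word has been seen before, add one
                my_words[word] += 1
            except KeyError:
                # If it's the first occurrence, set the count to one
                my_words[word] = 1

# Print each word once, together with its total count
for word, count in my_words.items():
    print(word, count)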
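
The same dictionary idea can also be written with `collections.Counter` from the standard library, which handles the "first occurrence" case on its own. The sketch below is only one possible variant: the function name `count_words` and the `ignore_case` flag are hypothetical, and folding `Hello` into `hello` is an assumption about what the asker wants, since the sample input contains both spellings.

```python
import csv
from collections import Counter


def count_words(path, ignore_case=False):
    """Return a Counter mapping each word in the CSV file to its frequency.

    `count_words` and `ignore_case` are illustrative names, not part of
    the original question or answer.
    """
    counts = Counter()
    with open(path, newline="") as csvfile:
        for row in csv.reader(csvfile):
            for cell in row:
                word = cell.strip()      # drop stray whitespace around a word
                if ignore_case:
                    word = word.lower()  # assumption: "Hello" and "hello" are the same word
                if word:                 # skip empty cells
                    counts[word] += 1    # missing keys start at 0 in a Counter
    return counts
```

Looping over `count_words(input_file).items()` prints each word once with its count, just like the dictionary version. With the sample input from the question, the default call counts `cat` twice and `Cat` once, while `ignore_case=True` merges them into a single `cat` entry with a count of 3.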
