trying to create a dictionary from a text file but

I have a text file (a paragraph). I need to read the file and create a dictionary in which each distinct word from the file is a key, and the value for each key is an integer giving the frequency of that word in the file. An example of what the dictionary should look like:

{'and':2, 'all':1, 'be':1, 'is':3} etc.

So far I have this:

def create_word_frequency_dictionary () :
    filename = 'dictionary.txt'
    infile = open(filename, 'r')
    line = infile.readline()

    my_dictionary = {}
    frequency = 0

    while line != '' :
        row = line.lower()
        word_list = row.split()
        print(word_list)
        print (word_list[0])
        words = word_list[0]
        my_dictionary[words] = frequency+1
        line = infile.readline()

    infile.close()

    print (my_dictionary)

create_word_frequency_dictionary()

Any help would be appreciated, thanks.

The documentation describes the collections module as "High-performance container datatypes". Consider using collections.Counter instead of reinventing the wheel.

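from collections import Counter

filename = 'dictionary.txt'
# The with-block closes the file automatically.
with open(filename, 'r') as infile:
    text = infile.read()
# Lowercase first so that 'And' and 'and' count as the same word.
print(Counter(text.lower().split()))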
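
As a rough sketch of the same idea on an in-memory string: the helper below lowercases the text, strips leading and trailing punctuation from each word, and counts what is left. The name count_words and the sample sentence are made up for illustration; they are not from the question.

```python
from collections import Counter
import string

def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # Strip punctuation from both ends of every whitespace-separated token.
    words = (w.strip(string.punctuation) for w in text.lower().split())
    # Skip tokens that were nothing but punctuation.
    return Counter(w for w in words if w)

sample = "All is well and all is good, and it is what it is."
counts = count_words(sample)
print(counts["is"], counts["all"], counts["good"])
```

A Counter is a dict subclass, so it can be used wherever the plain dictionary from the question is expected, and counts.most_common(n) returns the n most frequent words.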

Update: Okay, I fixed your code and now it works, but Counter is still a better option:

def create_word_frequency_dictionary () :
    filename = 'dictionary.txt'
    infile = open(filename, 'r') 
    lines = infile.readlines()

    my_dictionary = {}

    for line in lines:
        row = line.lower()
        for word in row.split():
            if word in my_dictionary:
                my_dictionary[word] = my_dictionary[word] + 1
            else:
                my_dictionary[word] = 1

    infile.close()
    print (my_dictionary)

create_word_frequency_dictionary()
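
One possible variation on the fixed function, offered as a sketch rather than the only way to do it: take the file path as a parameter, return the dictionary instead of printing it, and use dict.get to supply 0 for a word that has not been seen yet. The name word_frequencies is hypothetical.

```python
def word_frequencies(path):
    """Return a dict mapping each lowercased word in the file to its count."""
    counts = {}
    # The with-block closes the file even if reading raises an error.
    with open(path, "r") as infile:
        for line in infile:
            for word in line.lower().split():
                # get() returns 0 the first time a word is seen.
                counts[word] = counts.get(word, 0) + 1
    return counts
```

Calling word_frequencies('dictionary.txt') would then give back the dictionary, which the caller can print or process further.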

If you are using a version of Python that does not have Counter (it was added in Python 2.7 and 3.1), collections.defaultdict works too:

>>> import collections
>>> words = ["a", "b", "a", "c"]
>>> word_frequency = collections.defaultdict(int)
>>> for w in words:
...   word_frequency[w] += 1
... 
>>> print word_frequency
defaultdict(<type 'int'>, {'a': 2, 'c': 1, 'b': 1})
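
The session above uses the Python 2 print statement. For reference, a Python 3 sketch of the same defaultdict approach might look like this:

```python
from collections import defaultdict

words = ["a", "b", "a", "c"]
# defaultdict(int) creates a missing key with the value int() == 0.
word_frequency = defaultdict(int)
for w in words:
    word_frequency[w] += 1

# Convert to a plain dict for a cleaner printout.
print(dict(word_frequency))
```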

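Just replace my_dictionary[words] = frequency+1 with my_dictionary[words] = my_dictionary.get(words, 0)+1. Writing my_dictionary[words] = my_dictionary[words]+1 expresses the same idea, but it raises a KeyError the first time a word is seen, because the key does not exist yet.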
