
Beginner word-counting program only produces output for the last line in Python

I am a beginner programmer attempting to build a simple program. It should count every word in the file, but as written it only counts the words in the last line of text.

tm = open('myfile.txt', 'r')
for line in tm:
    line = line.replace ('\n', '')
    line = line.strip()
    line = line.translate(None, '!#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
    line = line.lower()
    line = line.split(' ')
    list = line
dict ={}
for word in list:
    dict[word]=1
if word in dict:
    count = dict[word]
    count += 1
    dict[word] = count
else:
    dict[word]=1
for word,count in dict.iteritems():
    print word + ": " + str(count)

My output is this

about: 1
to: 1
subscribe: 1
hear: 1
new: 1
our: 1
newsletter: 1
email: 1
ebooks: 2

This is for a 500-page document. Any help is appreciated.

Replace this line in your code:

list = line # that's not how you add elements to a list!

With this one, after creating an empty list (list = []) once before the loop:

list.extend(line)

It would also be a good idea to rename the list variable to lst, because list is a built-in and overwriting it is a bad idea. The same goes for dict: you should not use it as a variable name.
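Here is a minimal sketch of how the corrected loop might look. It assumes Python 3, whereas the question's code is Python 2, so punctuation is removed with str.maketrans instead of translate(None, ...). The function names collect_words and count_words and the sample lines are made up for illustration. Note that in the question's code the if/else check sits outside the for loop and dict[word]=1 runs for every word, so the counts would stay at 1 even after the extend fix. The sketch therefore also moves the check inside the loop.

```python
# Punctuation characters copied from the question's code.
PUNCT = '!#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'


def collect_words(lines):
    """Gather the cleaned words from every line into one list."""
    lst = []  # created once, before the loop
    # Python 3 replacement for the Python 2 call translate(None, PUNCT)
    table = str.maketrans('', '', PUNCT)
    for line in lines:
        # split() with no argument also discards empty strings from repeated spaces
        words = line.strip().translate(table).lower().split()
        lst.extend(words)  # add this line's words instead of replacing the list
    return lst


def count_words(lst):
    """Count occurrences by hand, with the membership check inside the loop."""
    counts = {}
    for word in lst:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


# Hypothetical sample lines standing in for the lines of myfile.txt
sample = ["Subscribe to our newsletter!\n", "New ebooks, new ebooks.\n"]
words = collect_words(sample)
counts = count_words(words)
for word, count in counts.items():
    print(word + ": " + str(count))
```

To run it on a real file, an open file object can be passed in place of the sample list, for example collect_words(open('myfile.txt')), since iterating over a file yields its lines.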

Another good idea: use a Counter object to keep track of the word frequencies. It is much easier than updating the dictionary's counts by hand. The whole block of code where you create and fill the dictionary can be replaced by this:

from collections import Counter
d = Counter(lst) # notice the suggested variable names
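For a fuller picture, here is a sketch of the whole program built around Counter. It is a variation on the snippet above rather than the answer's exact code: each line's words are fed to Counter.update, so no intermediate list is needed. It assumes Python 3, and the name word_frequencies and the sample lines are hypothetical.

```python
from collections import Counter

# Punctuation characters copied from the question's code.
PUNCT = '!#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'


def word_frequencies(lines):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    table = str.maketrans('', '', PUNCT)  # Python 3 way to delete characters
    counter = Counter()
    for line in lines:
        # update() adds one count per word in the iterable it receives
        counter.update(line.translate(table).lower().split())
    return counter


# Hypothetical sample lines standing in for the lines of myfile.txt
sample_lines = ["Hear about new ebooks.\n", "Subscribe to hear about ebooks!\n"]
freq = word_frequencies(sample_lines)
for word, count in freq.most_common(3):  # the three most frequent words
    print(word + ": " + str(count))
```

A Counter returns 0 for words it has never seen, so freq['python'] does not raise a KeyError. For a real file, something like `with open('myfile.txt') as tm: freq = word_frequencies(tm)` should work, since a file object iterates over its lines.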

As Oscar said, you should add the items from each line to your list instead of replacing the list. Use extend rather than append, though.

list.extend(line)

extend adds all the items from one list to another in a single call.

append adds a single item to a list.
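The difference is easy to see in a short example. The word lists below are made up for illustration.

```python
line = ['new', 'ebooks']

appended = ['hear', 'about']
appended.append(line)  # adds the whole list as one nested element
print(appended)        # ['hear', 'about', ['new', 'ebooks']]

extended = ['hear', 'about']
extended.extend(line)  # adds each word as its own element
print(extended)        # ['hear', 'about', 'new', 'ebooks']
```

With append, the word-counting loop would end up iterating over nested lists rather than words, which is why extend is the better fit here.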
