I am a beginner programmer attempting to build a simple program. It should count every word in the file, but as written it only counts the words on the last line of text.
tm = open('myfile.txt', 'r')
for line in tm:
    line = line.replace ('\n', '')
    line = line.strip()
    line = line.translate(None, '!#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
    line = line.lower()
    line = line.split(' ')
    list = line
dict ={}
for word in list:
    dict[word]=1
    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count
    else:
        dict[word]=1
for word,count in dict.iteritems():
    print word + ": " + str(count)
My output is this:
about: 1
to: 1
subscribe: 1
hear: 1
new: 1
our: 1
newsletter: 1
email: 1
ebooks: 2
That is all I get for a 500-page document. Any help is appreciated.
Replace this line in your code:
list = line # that's not how you add elements to a list!
With this one, and create the list as an empty list ([]) before the loop so that there is something to extend:
list.extend(line)
And it'd be a good idea to rename the list variable to lst, because list is a built-in and it's a bad idea to overwrite it. Same thing for dict, you should not use that as a variable name.
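Putting the extend fix and the renaming together, a minimal sketch of the corrected loop might look like the following. It is written for Python 3 (an assumption; the question uses Python 2), it reads from a small in-memory list of sample lines instead of myfile.txt, and it uses string.punctuation in place of the hand-written punctuation string. The names sample_lines, strip_punct and counts are illustrative only.

```python
import string

# Hypothetical sample input standing in for the lines of myfile.txt.
sample_lines = [
    "Subscribe to our newsletter!\n",
    "Hear about new ebooks, ebooks and more.\n",
]

# Translation table that deletes ASCII punctuation (Python 3 form of translate).
strip_punct = str.maketrans("", "", string.punctuation)

lst = []  # created once, before the loop
for line in sample_lines:
    words = line.strip().translate(strip_punct).lower().split()
    lst.extend(words)  # accumulate the words from every line

counts = {}
for word in lst:
    if word in counts:
        counts[word] += 1
    else:
        counts[word] = 1

for word, count in counts.items():
    print(word + ": " + str(count))
```

Because lst is created before the loop and extended inside it, words from both sample lines are counted, and ebooks is reported twice.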
Another good idea: use a Counter object to keep track of the word frequency, it's much easier than updating the dictionary's counter values by hand. The whole block of code where you create and fill the dictionary can be replaced by this:
from collections import Counter
d = Counter(lst) # notice the suggested variable names
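As a rough illustration of what the Counter gives you, here is a small self-contained sketch in Python 3 (an assumption; the question uses Python 2). The word list is made up for the example and stands in for the lst built from the file.

```python
from collections import Counter

# Hypothetical word list, standing in for the lst built from the file.
lst = ["ebooks", "new", "ebooks", "email", "new", "ebooks"]

d = Counter(lst)  # counts every element in one step

# most_common() returns (word, count) pairs, highest count first.
for word, count in d.most_common():
    print(word + ": " + str(count))
```

A Counter behaves like a dictionary, so d["ebooks"] gives 3 here, and a word that never appeared gives 0 instead of raising a KeyError.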
As Oscar said, you should add the items from each line to your list instead of replacing the list. Use extend rather than append for this, though.
list.extend(line)
extend adds all the items from another list to your list in one call.
append is for adding a single item to a list.
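The difference is easiest to see side by side. This is only a small sketch with made-up words; the variable names are illustrative.

```python
words_on_line = ["hear", "about", "new", "ebooks"]

extended = ["subscribe"]
extended.extend(words_on_line)  # adds each word as its own item

appended = ["subscribe"]
appended.append(words_on_line)  # adds the whole list as one nested item

print(extended)
print(appended)
```

With extend the result is a flat list of five words. With append the result has only two items, the second being a nested list, which is not what a word counter wants.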