Least common words in a file

Question

I want to find the least common words in a file.

from collections import Counter

# Load the file and extract the words
lines = open("mobydick.txt").readlines()
words = [ word for l in lines for word in l.rstrip().split() ]
print 'No of words in the file:', len(words)

# Use counter to get the counts
counts = Counter( words )

print 'Least common words:'
for word, count in sorted(counts.most_common()[:-3], key=lambda (word, count): (count, word), reverse=True):
    print '%s %s' % (word, count)

How do I limit the output to just 3 words? Right now it prints a whole bunch.

Answer 1

You are slicing the list the wrong way: [:-3] gives everything except the last three items, while [-3:] gives only the last three. Compare:

print [1,2,3,4,5][:-3]
[1, 2]
print [1,2,3,4,5][-3:]
[3, 4, 5]

Answer 2

least_common = counts.most_common()[-3:]
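
For context, here is a minimal end-to-end sketch of that one-liner. The sample string is made up and stands in for the contents of the file; the print call is written so it runs on Python 3 as well as Python 2.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of the file.
text = "the whale the sea the ship whale sea captain"
counts = Counter(text.split())

# most_common() with no argument returns every (word, count) pair,
# ordered from most frequent to least frequent.
ranked = counts.most_common()

# [-3:] keeps only the last three pairs, i.e. the least frequent ones.
least_common = ranked[-3:]
for word, count in least_common:
    print('%s %s' % (word, count))
```

Note that when several words share the same count, which of them end up in the last three depends on the order in which most_common() happens to list the ties.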

Answer 3

Just move the colon in the slice:

for word, count in counts.most_common()[-3:]:
    print '%s %s' % (word, count)

And as @Joran commented, you don't need to sort the result of most_common(), since it is already ordered from most common to least common.
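
If you want the ties broken in a predictable way (say, alphabetically), one possible alternative is heapq.nsmallest with an explicit key. This is only a sketch, not something from the question: the helper name least_common and the sample words are made up for illustration, and it is written for Python 3.

```python
import heapq
from collections import Counter


def least_common(counts, n):
    """Return the n lowest-count (word, count) pairs.

    Hypothetical helper: ties are broken alphabetically by word.
    """
    return heapq.nsmallest(n, counts.items(),
                           key=lambda pair: (pair[1], pair[0]))


# Made-up sample data.
counts = Counter("a a a b b c d".split())

# most_common() is already in descending order of count,
# so no extra sorted() call is needed on top of it.
ranked = counts.most_common()
print(ranked)

# The two rarest words, lowest count first.
print(least_common(counts, 2))
```

Unlike the slice, this returns the rarest words first, and it does not have to build the fully sorted list when you only need a few entries.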

3 answers

solution1
5 ACCEPTED 2015-12-07 23:36:12

solution2
2 2015-12-07 23:35:44

solution3
2 2015-12-07 23:36:03
