Python: sort dict by values, print key and value
I am trying to sort all the words in a file and return the 20 most frequently occurring words. Here is my code:
import sys

def helper_function(filename):
    # Count how often each word occurs in the file, ignoring case.
    words_count = {}
    with open(filename, 'r') as the_file:
        for line in the_file:
            for word in line.split():
                # Lowercase once so the lookup and the stored key agree.
                word = word.lower()
                if word in words_count:
                    words_count[word] += 1
                else:
                    words_count[word] = 1
    return words_count

def print_words(filename):
    words_count = helper_function(filename)
    for w in sorted(words_count.keys()):
        print w, words_count[w]

def print_top(filename):
    words_count = helper_function(filename)
    for w in sorted(words_count.values()):
        print w

def main():
    if len(sys.argv) != 3:
        print 'usage: ./wordcount.py {--count | --topcount} file'
        sys.exit(1)
    option = sys.argv[1]
    filename = sys.argv[2]
    if option == '--count':
        print_words(filename)
    elif option == '--topcount':
        print_top(filename)
    else:
        print 'unknown option: ' + option
        sys.exit(1)

if __name__ == '__main__':
    main()
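The way I defined print_top() prints only the sorted values of the words_count dictionary, but I would like the output to look like this: Word: Count
Any suggestions would be much appreciated!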
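You were close. You just need to sort the dict items by value; itemgetter(1) picks the value out of each (key, value) pair so it can be used as the sort key.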
>>> word_count = {'The' : 2, 'quick' : 8, 'brown' : 4, 'fox' : 1 }
>>> from operator import itemgetter
>>> for word, count in reversed(sorted(word_count.iteritems(), key=itemgetter(1))):
...     print word, count
...
quick 8
brown 4
The 2
fox 1
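The session above is Python 2 (iteritems and the print statement). As a rough sketch of the same idea for Python 3, where dicts have items() but no iteritems(), the version below sorts the items by value and builds the Word: Count form the question asks for. The function name format_by_count is made up for illustration and does not come from the original answer.

```python
from operator import itemgetter

def format_by_count(word_count):
    """Return "word: count" strings, highest count first."""
    # items() yields (word, count) pairs; itemgetter(1) selects the count
    # as the sort key, and reverse=True puts the largest counts first.
    pairs = sorted(word_count.items(), key=itemgetter(1), reverse=True)
    return ["{}: {}".format(word, count) for word, count in pairs]

word_count = {'The': 2, 'quick': 8, 'brown': 4, 'fox': 1}
for line in format_by_count(word_count):
    print(line)
```

Passing reverse=True to sorted() avoids the extra reversed() call. Words with equal counts then keep their original relative order, which can differ from what reversed(sorted(...)) gives for ties.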
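For the "top 20", I would suggest taking a look at heapq, which can pick out the largest items without sorting the whole dict: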
>>> import heapq
>>> heapq.nlargest(3, word_count.iteritems(), itemgetter(1))
[('quick', 8), ('brown', 4), ('The', 2)]
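Putting the pieces together for the question's use case might look like the sketch below. It assumes Python 3 and swaps the hand-written counting loop for collections.Counter, which is not part of the original answer. The helper names count_words and top_words are hypothetical.

```python
import heapq
from collections import Counter
from operator import itemgetter

def count_words(text):
    """Count whitespace-separated words, ignoring case."""
    return Counter(word.lower() for word in text.split())

def top_words(word_count, n=20):
    """Return the n (word, count) pairs with the highest counts."""
    # nlargest avoids sorting every item when only the top n are needed.
    return heapq.nlargest(n, word_count.items(), key=itemgetter(1))

counts = count_words("the quick fox and the lazy dog and the cat")
for word, count in top_words(counts, 2):
    print("{}: {}".format(word, count))
```

Because count_words returns a Counter, counts.most_common(20) would be another way to get the same kind of list; top_words is kept separate here so it also works on a plain dict such as the one built in the question.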
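To get the output in "key: value" form once the dictionary has been populated with keys and values, use the return value of a function like this: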
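def getAllKeyValuePairs():
    # dict_name is assumed to be a dictionary populated elsewhere.
    # Build one "key: value" string per key, in sorted key order.
    return [key + ": " + str(dict_name[key]) for key in sorted(dict_name)]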
Or, for one specific key-value pair:
def getTheKeyValuePair(key):
    if key in dict_name:
        return key + ": " + str(dict_name[key])
    else:
        return "No such key (" + key + ") in the dictionary"
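The two helpers above read a global called dict_name. A possible variation, shown here only as a sketch, passes the dictionary in as an argument and uses str.format so that non-string keys also work. The names format_pairs and format_pair are invented for this example.

```python
def format_pairs(mapping):
    """Return one "key: value" line per entry, in sorted key order."""
    return "\n".join(
        "{}: {}".format(key, mapping[key]) for key in sorted(mapping)
    )

def format_pair(mapping, key):
    """Return "key: value" for one key, or a message if the key is absent."""
    if key in mapping:
        return "{}: {}".format(key, mapping[key])
    return "No such key ({}) in the dictionary".format(key)

word_count = {'quick': 8, 'brown': 4}
print(format_pairs(word_count))
print(format_pair(word_count, 'fox'))
```

Note that format_pairs orders the lines by key, not by count, so it complements the value-sorted output above and does not replace it.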