I am writing a function that returns the 10 most frequent word lengths in a file called wordlist.txt, which contains words starting with every letter from a to z. I have written a function (named 'value_length') that returns a list of the length of each word in a given list. I also applied Counter to a dictionary (with word lengths as keys and the frequency of each length as values) to solve the problem.
from collections import Counter
def value_length(seq):
    '''This function takes a sequence and returns a list that contains
    the length of each element
    '''
    value_l = []
    for i in range(len(seq)):
        length = len(seq[i])
        value_l.append(length)
    print(value_l)
# open the txt file
fileobj = open("wordlist.txt", "r")
file_content = []
# create a list with length of every single word
for line in fileobj:
    file_content.append(line)
wordlist_lengths = value_length(file_content)
# create a dictionary that has the number of occurrence of each length as key
occurrence = {x:file_content.count(x) for x in file_content}
c = Counter(occurrence)
c.most_common(10)
But whenever I run this code, I do not get the result I want; I only get the output of the value_length function (i.e. an extremely long list that has the length of each word). In other words, Python does not seem to process the dictionary. I do not understand what my mistake is.
There's no need to store the lengths in a list, or to use the list's count method; you've imported Counter already, so just use that to do the counting.
c = Counter()
for word in seq:
    length = len(word)
    c[length] += 1
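As a fuller sketch of the same idea, the snippet below wraps the counting in a function and adds a file-reading helper. The names top_word_lengths and top_word_lengths_in_file are hypothetical, and the code assumes wordlist.txt holds one word per line. It also addresses three likely causes of the original symptom: value_length prints its list instead of returning it, the result of most_common(10) is never printed, and lines read from a file keep their trailing newline, which would inflate every length by one.

```python
from collections import Counter


def top_word_lengths(words, n=10):
    """Return the n most common word lengths as (length, count) pairs."""
    # strip() removes the trailing newline that lines read from a file carry;
    # blank lines are skipped so they are not counted as words of length 0.
    lengths = (len(word.strip()) for word in words if word.strip())
    return Counter(lengths).most_common(n)


def top_word_lengths_in_file(path, n=10):
    """Same as above, assuming the file at path has one word per line."""
    with open(path, "r") as fileobj:
        return top_word_lengths(fileobj, n)


if __name__ == "__main__":
    sample = ["apple\n", "bee\n", "cat\n", "dog\n", "zebra\n"]
    # The result must be printed (or returned) to be visible in a script.
    print(top_word_lengths(sample))
```

Calling top_word_lengths_in_file("wordlist.txt") would then give the ten most frequent lengths for the file in the question, provided the file really is laid out one word per line.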
This code finds the length of each list item and sorts the lengths. You can then build a list of tuples that pair each length with the number of times it occurs in the list:
words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]
lengths = [len(w) for w in words]
print(lengths)
sortedLengths = sorted(lengths)
print(sortedLengths)
countedLengths = [(w, sortedLengths.count(w)) for w in sortedLengths]
print(countedLengths)
This prints:
[2, 3, 5, 4, 2, 5, 3, 3, 9]
[2, 2, 3, 3, 3, 4, 5, 5, 9]
[(2, 2), (2, 2), (3, 3), (3, 3), (3, 3), (4, 1), (5, 2), (5, 2), (9, 1)]
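Note that the last list repeats a tuple once for every occurrence of a length. A possible variation, sketched here with the hypothetical names counted and top10, iterates over the distinct lengths only and then orders the pairs by frequency to get at most ten entries. It still calls list.count once per distinct length, so for a large word list Counter is likely the better fit.

```python
words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]
lengths = [len(w) for w in words]

# Count each distinct length once by iterating over the set of lengths.
counted = [(n, lengths.count(n)) for n in sorted(set(lengths))]
print(counted)  # [(2, 2), (3, 3), (4, 1), (5, 2), (9, 1)]

# Order by frequency, highest first, and keep at most ten entries.
# sorted() is stable, so lengths with equal counts stay in ascending order.
top10 = sorted(counted, key=lambda pair: pair[1], reverse=True)[:10]
print(top10)  # [(3, 3), (2, 2), (5, 2), (4, 1), (9, 1)]
```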