How do I use most_common in a Python counter?

I have a function that should give me the least frequent n percent of the words in my data. The function is:

from collections import Counter

def bottomnpercent(table, n):
    words = 0
    wordcounter = Counter()
    for key, data in table.scan():
        if key not in stopwords:
            words += 1
            wordcounter[key] += getsomedata
            idx = percentage(n, words)
    return Counter(wordcounter.most_common()[-idx:])

(table.scan loops through an HBase table that holds a word and a frequency count; getsomedata does a lookup that returns the count for a particular word.)

The problem is this returns a counter of the form:

Counter({('stopped', 173): 1, ('thrilling', 17): 1, ('fluids', 18): 1, ('Pictures', 18): 1, ('steering', 37): 1,... 

which is no good as everything occurs 1 time and I need something like:

 Counter({('stopped'): 173, ('thrilling'): 17, ('fluids'): 18, ('Pictures'): 18, ('steering'): 37,... 

but I cannot figure out how. Any help is much appreciated. TIA!

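This happens because wordcounter.most_common() returns a list of (word, count) tuples, and you then pass that list to Counter again in return Counter(wordcounter.most_common()[-idx:]). The second Counter treats each tuple as a single item and counts it once, which is why every value is 1. Return the slice directly instead: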

return wordcounter.most_common()[-idx:]
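
Note that the slice is a list of (word, count) tuples, not a Counter. If a Counter like the one shown in the question is needed, one option is to convert the slice to a dict before wrapping it. The following is only a minimal sketch: bottom_n_percent, the sample data, and the integer percentage calculation are hypothetical stand-ins for the question's own table, getsomedata, and percentage helpers, which are not shown.

```python
from collections import Counter

def bottom_n_percent(word_counts, n):
    """Return a Counter holding the n percent least frequent words."""
    # most_common() gives (word, count) tuples, most frequent first.
    ranked = word_counts.most_common()
    # Assumed stand-in for percentage(n, words). The max(1, ...) guard
    # matters because a slice of [-0:] would return the whole list.
    idx = max(1, len(ranked) * n // 100)
    # dict() turns the tuples back into word -> count pairs, so Counter
    # keeps the original counts instead of counting each tuple once.
    return Counter(dict(ranked[-idx:]))

# Made-up sample data standing in for the HBase scan results.
sample = Counter({"the": 500, "stopped": 173, "steering": 37,
                  "fluids": 18, "thrilling": 17})
bottom = bottom_n_percent(sample, 40)
print(bottom)
```

With this sample, 40 percent of 5 words is 2, so the result holds only the two least frequent words with their original counts.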
