
Python: grouping labels based on frequency of occurrence

I have labels and their frequencies (i.e., the number of times each label is repeated) for a dataset.

Is there a library that can group together labels with similar frequencies (i.e., based on how much the counts vary)?

As an example: suppose a is repeated 10 times, b 9 times, c 6 times, d 5 times, and e 2 times. I want a and b to fall into one group, c and d into another group, and e into a third group.

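You can use the following function to group labels by count. It puts labels with exactly the same count into one group; labels whose counts are merely close (such as 15 and 16) land in separate groups.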

def group_labels(cnts):
  """Group labels that share the same count."""
  d = {}  # maps count -> list of labels with that count
  for k, v in cnts.items():
    d.setdefault(v, []).append(k)
  return sorted(d.values(), key=lambda x: x[0])  # sorted by first label

Example

cnts = {'a': 4, 'b': 15, 'c': 4, 'd': 16, 'e': 1, 'f': 16}
print(group_labels(cnts))
# Output: [['a', 'c'], ['b'], ['d', 'f'], ['e']]
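
The question asks for labels with similar, not necessarily identical, counts to end up together. One possible way to get that without a library is sketched below. It sorts the labels by count and starts a new group whenever the gap to the previous count exceeds a tolerance. The function name `group_by_similar_count` and the `max_gap` parameter are made up for this illustration and are not part of the answer above. The tolerance is an assumption you would need to tune for your data.

```python
def group_by_similar_count(cnts, max_gap=1):
    """Group labels whose counts lie within max_gap of the neighbouring count.

    Hypothetical helper for illustration: `max_gap` is an assumed tolerance.
    """
    # Sort from most to least frequent; ties are broken alphabetically by label.
    items = sorted(cnts.items(), key=lambda kv: (-kv[1], kv[0]))
    groups = []
    prev = None  # count of the previously visited label
    for label, count in items:
        if prev is not None and prev - count <= max_gap:
            groups[-1].append(label)   # close enough: extend the current group
        else:
            groups.append([label])     # gap too large (or first label): new group
        prev = count
    return groups


print(group_by_similar_count({'a': 10, 'b': 9, 'c': 6, 'd': 5, 'e': 2}))
# [['a', 'b'], ['c', 'd'], ['e']]
```

Because each count is compared only with the one just before it, a long run of small steps (for example 10, 9, 8, 7) stays in a single group. If that is not what you want, compare against the first count of the current group instead.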

