
Reduce a list of lists to a dictionary with sublist size as keys and number of occurrences as values

I have a list of lists and I want to count the number of times a sublist with a specific size occurs.

E.g., for the list [[1], [1,2], [1,2], [1,2,3]] I expect to get {1: 1, 2: 2, 3: 1}.

I've tried the reduce function, but I get a syntax error on += 1 and have no idea what is wrong.

list_of_list = [[1], [1,2], [1,2], [1,2,3]]
result = functools.reduce(lambda dict,list: dict[len(list)] += 1, list_of_list, defaultdict(lambda: 0, {}))

It is not a good idea to use reduce in such a complicated way when you can use collections.Counter() with the map() function, which is more Pythonic:

>>> A = [[1], [1,2], [1,2], [1,2,3]]
>>> from collections import Counter
>>> 
>>> Counter(map(len,A))
Counter({2: 2, 1: 1, 3: 1})

Note that map may perform marginally better than a generator expression here: when Counter() is given a generator expression, Python has to resume the underlying generator function for every value, whereas the built-in map calls len directly (see the PEP 289 excerpt below). As the timings show, the difference is negligible for a list this small.

~$ python -m timeit --setup "A = [[1], [1,2], [1,2], [1,2,3]];from collections import Counter" "Counter(map(len,A))"
100000 loops, best of 3: 4.7 usec per loop
~$ python -m timeit --setup "A = [[1], [1,2], [1,2], [1,2,3]];from collections import Counter" "Counter(len(x) for x in A)"
100000 loops, best of 3: 4.73 usec per loop
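
The same comparison can also be run from a script with the timeit module. This is a minimal sketch, assuming Python 3; the helper names count_with_map and count_with_genexp are made up for illustration, and absolute timings will vary by machine and interpreter version:

```python
import timeit
from collections import Counter

A = [[1], [1, 2], [1, 2], [1, 2, 3]]

def count_with_map():
    # map() calls the built-in len directly for each sublist
    return Counter(map(len, A))

def count_with_genexp():
    # the generator expression is resumed once per sublist
    return Counter(len(x) for x in A)

# Both approaches must agree before their speed is worth comparing
assert count_with_map() == count_with_genexp()

# Total seconds for 100,000 calls of each variant
t_map = timeit.timeit(count_with_map, number=100_000)
t_gen = timeit.timeit(count_with_genexp, number=100_000)
print(f"map:    {t_map:.3f} s")
print(f"genexp: {t_gen:.3f} s")
```

On an input this small, a gap of a few hundredths of a microsecond per call is likely within measurement noise, so readability is a reasonable tie-breaker.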

From PEP 289 -- Generator Expressions:

The semantics of a generator expression are equivalent to creating an anonymous generator function and calling it. For example:

g = (x**2 for x in range(10))
print g.next()

is equivalent to:

def __gen(exp):
    for x in exp:
        yield x**2
g = __gen(iter(range(10)))
print g.next()

Note that generator expressions use less memory than map() in Python 2, where map() builds a complete list first, so with large data you'd better use a generator expression there. In Python 3, map() is lazy as well, and both forms have similar memory use.
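
As a quick check of the memory argument, the sketch below (assuming Python 3, where map() returns a lazy iterator rather than a list) confirms that both forms are iterators that yield one length at a time, so neither builds an intermediate list:

```python
from collections import Counter
from collections.abc import Iterator

A = [[1], [1, 2], [1, 2], [1, 2, 3]]

lengths_map = map(len, A)            # lazy map object in Python 3
lengths_gen = (len(x) for x in A)    # generator object

# Both are iterators that produce values only on demand
print(isinstance(lengths_map, Iterator), isinstance(lengths_gen, Iterator))
# Neither is a list
print(isinstance(lengths_map, list), isinstance(lengths_gen, list))

# Counter consumes either iterator one item at a time
print(Counter(lengths_map) == Counter(lengths_gen))
```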

You can do this using Counter as well:

from collections import Counter
list_of_list = [[1], [1,2], [1,2], [1,2,3]]
c = Counter(len(i) for i in list_of_list)
print(c)

Output:

Counter({2: 2, 1: 1, 3: 1})
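
If a plain dict ordered by sublist size is wanted, matching the {1: 1, 2: 2, 3: 1} form from the question, the Counter can be converted. A minimal sketch, assuming Python 3.7+ (where dict preserves insertion order); the variable name size_counts is only illustrative:

```python
from collections import Counter

list_of_list = [[1], [1, 2], [1, 2], [1, 2, 3]]
c = Counter(len(i) for i in list_of_list)

# Sort by key (sublist size) and rebuild as an ordinary dict
size_counts = dict(sorted(c.items()))
print(size_counts)  # {1: 1, 2: 2, 3: 1}
```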

reduce is an inferior tool for this job.

Look at collections.Counter instead. It's a dict subclass, so you should be able to use it however you were planning to use the dict.

>>> from collections import Counter
>>> L = [[1], [1, 2], [1, 2], [1, 2, 3]]
>>> Counter(len(x) for x in L)
Counter({2: 2, 1: 1, 3: 1})
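
On the syntax error in the question: += is an augmented assignment, which is a statement, and a lambda body may only contain a single expression. For comparison, here is a minimal sketch of how reduce could be made to work with a named function instead; the helper name add_size is hypothetical, and Counter remains the simpler choice:

```python
import functools
from collections import defaultdict

list_of_list = [[1], [1, 2], [1, 2], [1, 2, 3]]

def add_size(counts, sublist):
    # Statements such as += are allowed here, unlike in a lambda body
    counts[len(sublist)] += 1
    # reduce passes the return value on as the next accumulator
    return counts

result = functools.reduce(add_size, list_of_list, defaultdict(int))
print(dict(result))  # {1: 1, 2: 2, 3: 1}
```

The function has to return the accumulator explicitly, which is the part the original lambda also left out.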
