
Itertools groupby to group list using another list

In pandas you can group one Series by another Series of the same length, for example:

import numpy as np
import pandas as pd

s = pd.Series([1, 1, 1, -2, -4, -3, 1, 2])
g = np.sign(s).diff().fillna(0).abs().cumsum()  # label changes at each sign flip
s.groupby(g).count()

0.0    3
2.0    3
4.0    2
dtype: int64

Is it possible to do the same with itertools.groupby, that is, use another list to define the groups of the current one? Or perhaps use some key? Anything that gives me an idea of how to group [1,1,1,-2,-4,-3,1,2] according to the signs would be great.

Expected output:

[3,3,2]

You could do the following:

from itertools import groupby


data = [1, 1, 1, -2, -4, -3, 1, 2]

# Count the elements in each run of consecutive values that share a key.
result = [sum(1 for _ in group) for _, group in groupby(data, lambda x: x <= 0)]
print(result)

Output

[3, 3, 2]

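The expression sum(1 for _ in group) counts the number of elements in each group. The key lambda x: x <= 0 stands in for the sign function: it returns True for non-positive values and False for positive ones, so every change of sign starts a new group. Note that this key puts 0 in the same group as the negative numbers, whereas np.sign gives 0 a value of its own.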
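If zeros should form their own groups, as they would with np.sign, a three-valued key can be used instead. The following is a minimal sketch; the helper names sign and run_lengths_by_sign are hypothetical and not part of the original answer.

```python
from itertools import groupby


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x (hypothetical helper)."""
    # Booleans behave as 0/1 in arithmetic, so this yields -1, 0, or 1.
    return (x > 0) - (x < 0)


def run_lengths_by_sign(values):
    """Count consecutive elements that share the same sign."""
    return [sum(1 for _ in group) for _, group in groupby(values, key=sign)]


data = [1, 1, 1, -2, -4, -3, 1, 2]
print(run_lengths_by_sign(data))           # [3, 3, 2]
print(run_lengths_by_sign([1, 0, 0, -1]))  # [1, 2, 1]
```

With the x <= 0 key, the second input would give [1, 3] instead, because the zeros are merged with the trailing negative value.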

For the general case of grouping one iterable by the matching values in another iterable, you can use a slightly hacky key function that iterates the other iterable, e.g. using your original s and g:

>>> from itertools import groupby
>>> print([(k, len(list(grp))) for k, grp in groupby(s, key=lambda _, ig=iter(g): next(ig))])
[(0.0, 3), (2.0, 3), (4.0, 2)]

The key function accepts each value from s and ignores it, returning instead the matching value obtained by iterating g manually. The defaulted second argument caches an iterator created from g, and next advances it once per element. Pass a second argument to next to silently tolerate mismatched lengths by substituting a default value.
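The same idea can be sketched without hidden state in the key function by zipping the labels with the values and grouping on the label. The example below is only an illustration: the function names are hypothetical, and g is written out as a plain list standing in for the pandas Series from the question. The second function shows the next(iterator, default) variant mentioned above.

```python
from itertools import groupby
from operator import itemgetter


def group_counts_by_labels(values, labels):
    """Group consecutive values by their parallel labels; return (label, count) pairs."""
    pairs = zip(labels, values)  # zip stops at the shorter of the two inputs
    return [(label, sum(1 for _ in grp))
            for label, grp in groupby(pairs, key=itemgetter(0))]


def group_counts_with_default(values, labels, default=None):
    """Stateful-key variant: labels that run out are replaced by `default`."""
    it = iter(labels)
    return [(label, sum(1 for _ in grp))
            for label, grp in groupby(values, key=lambda _: next(it, default))]


s = [1, 1, 1, -2, -4, -3, 1, 2]
g = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 4.0, 4.0]  # assumed plain-list copy of the Series g

print(group_counts_by_labels(s, g))         # [(0.0, 3), (2.0, 3), (4.0, 2)]
print(group_counts_with_default(s, g[:6]))  # [(0.0, 3), (2.0, 3), (None, 2)]
```

The zip version keeps the key function pure, at the cost of silently dropping trailing values when the label list is shorter; the default-value version keeps every value and groups the unlabelled tail under the default.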

Obviously, for this specific case there are better approaches, but I'm answering the general question asked, not the specific example.
