In pandas you can group one Series by another Series of the same length, for example:
import numpy as np
import pandas as pd
s = pd.Series([1,1,1,-2,-4,-3,1,2])
g = np.sign(s).diff().fillna(0).abs().cumsum()
s.groupby(g).count()
0.0 3
2.0 3
4.0 2
dtype: int64
Is it possible to do the same using itertools.groupby? That is, can another list be used to create groups from the current one, or perhaps some key function? Anything that gives me an idea of how to group [1,1,1,-2,-4,-3,1,2] according to the signs would be great.
Expected output:
[3,3,2]
You could do the following:
from itertools import groupby
data = [1,1,1,-2,-4,-3,1,2]
result = [sum(1 for _ in group) for _, group in groupby(data, lambda x: x<= 0)]
print(result)
Output
[3, 3, 2]
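The expression sum(1 for _ in group) counts the number of elements in each group. The key lambda x: x <= 0 stands in for the sign function: it only separates non-positive values from positive ones, so a zero would land in the same group as the negatives. That is enough here because the data contains no zeros.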
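If zeros can occur and should form their own runs (as np.sign does in the pandas version), a three-way key is one option. The following is a minimal sketch; the helper name sign is an assumption introduced for illustration and is not part of itertools.

```python
from itertools import groupby

def sign(x):
    """Hypothetical helper: return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

data = [1, 1, 1, -2, -4, -3, 1, 2]

# groupby starts a new group whenever the key changes between
# consecutive elements, so each group is one run of equal sign.
run_lengths = [sum(1 for _ in group) for _, group in groupby(data, key=sign)]
print(run_lengths)  # [3, 3, 2]
```

With this key, an input such as [1, 0, 0, -1] would give run lengths [1, 2, 1] instead of merging the zeros with the negative value.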
For the general case of grouping one iterable based on the matching value in another iterable, you can just make a cheaty key function that iterates the other iterable, e.g. using your original s and g:
>>> from itertools import groupby
>>> print([(k, len(list(grp))) for k, grp in groupby(s, key=lambda _, ig=iter(g): next(ig))])
[(0.0, 3), (2.0, 3), (4.0, 2)]
The key function accepts the value from s and ignores it, instead returning the matching value from iterating g manually. The defaulted second argument caches an iterator created from g, and next is then used to advance it by one each time the key function is called. Pass a second argument to next to silently ignore mismatched lengths and simply substitute in a default value.
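The default-value variant mentioned above might look like the following sketch. It uses plain lists instead of pandas objects; the function name run_lengths_by and the deliberately shortened keys list are assumptions made up for illustration.

```python
from itertools import groupby

values = [1, 1, 1, -2, -4, -3, 1, 2]
keys = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]  # deliberately two items short

def run_lengths_by(values, keys, default=None):
    """Hypothetical helper: group values by the parallel iterable keys."""
    key_iter = iter(keys)
    # The lambda ignores the value itself and returns the next key instead;
    # once keys is exhausted, next() falls back to default.
    return [(k, len(list(grp)))
            for k, grp in groupby(values, key=lambda _: next(key_iter, default))]

result = run_lengths_by(values, keys)
print(result)  # [(0.0, 3), (2.0, 3), (None, 2)]
```

The two trailing values that have no matching key end up in a group labelled with the default. This relies on groupby calling the key function once per element, in order.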
Obviously, for this specific case there are better approaches, but I'm answering the general question asked, not the specific example.