Itertools groupby to group list using another list
In pandas you can group one Series by another Series of equal length, for example:
import numpy as np
import pandas as pd
s = pd.Series([1,1,1,-2,-4,-3,1,2])
g = np.sign(s).diff().fillna(0).abs().cumsum()
s.groupby(g).count()
0.0 3
2.0 3
4.0 2
dtype: int64
Is it possible to do the same using itertools.groupby? That is, can another list be used to create groups from the current one? Or perhaps some key? Anything that gives me an idea of how to group
[1,1,1,-2,-4,-3,1,2]
according to the signs would be great.
Expected output:
[3,3,2]
You could do the following:
from itertools import groupby
data = [1,1,1,-2,-4,-3,1,2]
result = [sum(1 for _ in group) for _, group in groupby(data, lambda x: x <= 0)]
print(result)
Output:
[3, 3, 2]
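The statement sum(1 for _ in group) counts the number of elements in the group. The key lambda x: x <= 0 stands in for the sign function: it returns True for non-positive values and False for positive ones, so consecutive elements with the same sign share a key. Note that it is not a true sign function, since it puts 0 in the same group as the negative values.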
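If zeros should form their own group, as np.sign would treat them in the pandas version, a three-valued key can be used instead. The following is only a sketch; the helper name sign is an assumption for illustration and is not part of the original answer:

```python
from itertools import groupby

def sign(x):
    # Hypothetical helper: returns 1, 0 or -1, mirroring np.sign for plain numbers.
    return (x > 0) - (x < 0)

data = [1, 1, 1, -2, -4, -3, 1, 2]

# Count the length of each run of consecutive elements sharing the same sign.
run_lengths = [sum(1 for _ in group) for _, group in groupby(data, key=sign)]
print(run_lengths)  # [3, 3, 2]
```

For the list in the question this gives the same result as the x <= 0 key; the two only differ when the data contains zeros.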
For the general case of grouping one iterable based on the matching value in another iterable, you can just make a cheaty key function that iterates the other iterable, e.g. using your original s and g:
>>> from itertools import groupby
>>> print([(k, len(list(grp))) for k, grp in groupby(s, key=lambda _, ig=iter(g): next(ig))])
[(0.0, 3), (2.0, 3), (4.0, 2)]
The key function accepts the value from s and ignores it, instead returning the matching value from iterating g manually. The defaulted second argument caches an iterator created from g, and next is used to advance it manually on each call. Pass a second argument to next to silently ignore mismatched lengths and simply substitute a default value.
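As a rough sketch of that default-value variant with plain lists: the function name group_sizes_by, its parameters and the sample labels below are assumptions made for illustration, with labels standing in for g.

```python
from itertools import groupby

def group_sizes_by(values, labels, default=None):
    # Hypothetical helper: group `values` by the parallel `labels` iterable.
    label_iter = iter(labels)
    # The key ignores the element and returns the next label instead;
    # once `labels` is exhausted, `default` is returned rather than raising.
    key = lambda _: next(label_iter, default)
    return [(k, sum(1 for _ in grp)) for k, grp in groupby(values, key=key)]

values = [1, 1, 1, -2, -4, -3, 1, 2]

same_length = group_sizes_by(values, [0, 0, 0, 2, 2, 2, 4, 4])
print(same_length)   # [(0, 3), (2, 3), (4, 2)]

# Labels two items short: the leftover values fall into the default group.
short_labels = group_sizes_by(values, [0, 0, 0, 2, 2, 2])
print(short_labels)  # [(0, 3), (2, 3), (None, 2)]
```

This relies on groupby calling the key exactly once per element, in order, which is also what the one-liner above depends on.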
Obviously, for this specific case there are better approaches, but I'm answering the general question asked, not the specific example.
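For the general question, a variant that avoids a stateful key function is to pair the two iterables with zip and group on the label half of each pair. This is a different technique from the one above, shown only as a sketch; the labels list is an assumed plain-list stand-in for g.

```python
from itertools import groupby
from operator import itemgetter

values = [1, 1, 1, -2, -4, -3, 1, 2]
labels = [0, 0, 0, 2, 2, 2, 4, 4]  # assumed stand-in for g

# Each item is a (label, value) pair; itemgetter(0) groups on the label.
paired_counts = [
    (label, sum(1 for _ in pairs))
    for label, pairs in groupby(zip(labels, values), key=itemgetter(0))
]
print(paired_counts)  # [(0, 3), (2, 3), (4, 2)]
```

Unlike the next-with-default approach, zip silently stops at the shorter of the two inputs.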