Using itertools.groupby() more effectively

Question

I am trying to improve my knowledge of the itertools library, since it's so generally useful. To that end, I'm trying to solve an interview puzzle I came across. A large part of it involves sequentially counting the runs of repeated digits within a number. For example, for the number:

1223444556

I want:

[(1,1),(2,2),(1,3),(3,4),(2,5),(1,6)]

which is to say, from left to right, there is 1 one, 2 twos, 1 three, and so forth.

Here is my current code:

from itertools import groupby
groups_first = [int(''.join(v)[0]) for k,v in groupby(str(1223444556))]
counts = [len(''.join(v)) for k,v in groupby(str(1223444556))]
zip(counts,groups_first)

It works, but I would like to know whether there is a more compact way of doing this that avoids zipping two lists together. Any thoughts? I suspect the answer involves some sort of lambda function in groupby(), but I can't see it yet.

Thanks!

Answer 1

I'd probably just write

>>> n = 1223444556
>>> [(len(list(g)), int(k)) for k,g in groupby(str(n))]
[(1, 1), (2, 2), (1, 3), (3, 4), (2, 5), (1, 6)]
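
For reuse, the same one-liner can be wrapped in a small function. This is only a minimal sketch; the name run_lengths is made up for illustration and does not come from the question.

```python
from itertools import groupby


def run_lengths(n):
    """Return (count, digit) pairs for each run of equal adjacent digits in n."""
    # groupby yields (key, group) pairs; each group is a lazy iterator,
    # so it is turned into a list before len() can be applied to it.
    return [(len(list(group)), int(digit)) for digit, group in groupby(str(n))]


print(run_lengths(1223444556))
```

Because each group is consumed inside the same comprehension that reads its key, a single pass over groupby is enough and no zip is needed.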

Answer 2

How about:

[(sum(1 for _ in v), int(k)) for k,v in groupby(str(1223444556))]
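
The sum(1 for _ in v) idiom counts the items of a group without building a temporary list for it. The same idea can be written as a generator, which may be handy for long inputs. A rough sketch, where iter_run_lengths is a hypothetical name chosen for this example:

```python
from itertools import groupby


def iter_run_lengths(digits):
    """Lazily yield (count, digit) for each run of equal adjacent characters."""
    for digit, group in groupby(digits):
        # sum(1 for _ in group) walks the group iterator one item at a time,
        # so no per-run list is created.
        yield sum(1 for _ in group), int(digit)


result = list(iter_run_lengths(str(1223444556)))
print(result)
```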

Answer 3

I'd probably opt for collections instead:

>>> from collections import Counter
>>> c = Counter('1223444556')
>>> c.items()
[('1', 1), ('3', 1), ('2', 2), ('5', 2), ('4', 3), ('6', 1)]

If the order is important (as you say in your comment), this may no longer be the most efficient method. But for completeness, you could do this:

>>> t = c.items()
>>> t = sorted(t)

And if you want each pair x, y to be listed as y, x, you could then do this:

>>> t = [(y, x) for x, y in t]
>>> print t
[(1, '1'), (2, '2'), (1, '3'), (3, '4'), (2, '5'), (1, '6')]

One advantage of this method is that the repeated element is listed as a string, so there is no confusion about which number comes from the original input and which number indicates frequency.
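
One caveat worth checking: Counter tallies the total occurrences of each digit, while groupby counts consecutive runs. The two agree for 1223444556 only because every digit there occurs in a single run. A small sketch of the difference, using a made-up input 1221 that is not from the question:

```python
from collections import Counter
from itertools import groupby

digits = "1221"  # hypothetical input: the digit 1 appears in two separate runs

# Counter adds up every occurrence of a digit, wherever it appears.
totals = Counter(digits)

# groupby starts a new group each time the digit changes.
runs = [(len(list(group)), int(digit)) for digit, group in groupby(digits)]

print(totals["1"], totals["2"])
print(runs)
```

If the puzzle needs run lengths rather than overall frequencies, the groupby versions are probably the safer fit.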

3 answers

Answer 1: 2, 2013-01-31 04:11:23

Answer 2: 2, accepted, 2013-01-31 04:11:25

Answer 3: 1, 2013-01-31 04:13:29
