# How do I use itertools.groupby()?

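• Take a list; in this case, the children of an objectified `lxml` element
• Divide it into groups based on some criteria
• Then later iterate over each of these groups separately.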

## 12 Answers

### ===============>>#1 Votes: 610 Accepted

```python
from itertools import groupby

groups = []
uniquekeys = []
# data and keyfunc are supplied by the caller
for k, g in groupby(data, keyfunc):
    groups.append(list(g))    # Store group iterator as a list
    uniquekeys.append(k)
```

`k` is the current grouping key, and `g` is an iterator over the group defined by that key. In other words, the `groupby` iterator itself returns iterators.

```python
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]

for key, group in groupby(things, lambda x: x[0]):
    for thing in group:
        print("A %s is a %s." % (thing[1], key))
    print(" ")
```

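The `groupby()` function takes two arguments: (1) the data to group and (2) the function to group it with. Here, `lambda x: x[0]` uses the first item of each tuple as the grouping key. Because `groupby()` only groups consecutive items, the data should already be sorted by that same key.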

```python
for key, group in groupby(things, lambda x: x[0]):
    listOfThings = " and ".join([thing[1] for thing in group])
    print(key + "s:  " + listOfThings + ".")
```
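When the input is not already ordered by the grouping key, a common pattern is to sort and group with the same key function. The following is a minimal sketch under that assumption; the sample data and the names `category` and `grouped` are made up for illustration and do not come from the answer above.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, deliberately not sorted by category.
things = [
    ("vehicle", "speed boat"),
    ("animal", "bear"),
    ("plant", "cactus"),
    ("animal", "duck"),
    ("vehicle", "school bus"),
]

category = itemgetter(0)  # one key function, used for sorting and grouping

# sorted() is stable, so items keep their relative order inside each group.
grouped = {
    key: [name for _, name in group]
    for key, group in groupby(sorted(things, key=category), key=category)
}

print(grouped)
# {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']}
```

Using one key function for both steps avoids the case where the sort order and the grouping criterion silently disagree.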

### ===============>>#2 Votes: 73

The example in the Python documentation is quite straightforward:

```python
groups = []
uniquekeys = []
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
```

### ===============>>#3 Votes: 48

`itertools.groupby` is a tool for grouping items.

`# [k for k, g in groupby('AAAABBBCCDAABBB')] --> ABCDAB`

`# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D`

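`groupby` objects yield key-group pairs, where each group is a lazy iterator.

• A. Group consecutive items together
• B. Group all occurrences of an item, given a sorted iterable
• C. Specify how to group items with a key function*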

```python
# Define a printer for comparing outputs
>>> import itertools as it
>>> def print_groupby(iterable, keyfunc=None):
...     for k, g in it.groupby(iterable, keyfunc):
...         print("key: '{}'--> group: {}".format(k, list(g)))
```

```python
# Feature A: group consecutive occurrences
>>> print_groupby("BCAACACAADBBB")
key: 'B'--> group: ['B']
key: 'C'--> group: ['C']
key: 'A'--> group: ['A', 'A']
key: 'C'--> group: ['C']
key: 'A'--> group: ['A']
key: 'C'--> group: ['C']
key: 'A'--> group: ['A', 'A']
key: 'D'--> group: ['D']
key: 'B'--> group: ['B', 'B', 'B']

# Feature B: group all occurrences
>>> print_groupby(sorted("BCAACACAADBBB"))
key: 'A'--> group: ['A', 'A', 'A', 'A', 'A']
key: 'B'--> group: ['B', 'B', 'B', 'B']
key: 'C'--> group: ['C', 'C', 'C']
key: 'D'--> group: ['D']

# Feature C: group by a key function
>>> # keyfunc = lambda s: s.islower()                      # equivalent
>>> def keyfunc(s):
...     """Return True if a string is lowercase, else False."""
...     return s.islower()
>>> print_groupby(sorted("bCAaCacAADBbB"), keyfunc)
key: 'False'--> group: ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'D']
key: 'True'--> group: ['a', 'a', 'b', 'b', 'c']
```

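*A function that every item is passed through and compared by, which influences the result. Other callables that accept a key function include `sorted()`, `max()` and `min()`.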

```python
# OP: Yes, you can use `groupby`, e.g.
[do_something(list(g)) for _, g in groupby(lxml_elements, criteria_func)]
```
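To make the one-liner above concrete, here is a small sketch that groups the children of an XML element by tag name. It uses the standard library's `xml.etree.ElementTree` as a stand-in for `lxml` (an assumption made only so the example runs without third-party packages); the sample document and the body of `criteria_func` are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical document; the tag names are made up for illustration.
root = ET.fromstring("<root><a>1</a><a>2</a><b>3</b><a>4</a></root>")

def criteria_func(element):
    """Grouping criterion: the element's tag name."""
    return element.tag

# Iterating an Element yields its direct children in document order,
# so this groups consecutive runs of children that share a tag.
runs = [(tag, [el.text for el in g]) for tag, g in groupby(root, criteria_func)]
print(runs)        # [('a', ['1', '2']), ('b', ['3']), ('a', ['4'])]

# Sorting by the same criterion first collects every child with a given tag.
all_by_tag = [
    (tag, [el.text for el in g])
    for tag, g in groupby(sorted(root, key=criteria_func), criteria_func)
]
print(all_by_tag)  # [('a', ['1', '2', '4']), ('b', ['3'])]
```

Whether to sort first depends on whether document order between groups matters for the later processing step.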

### ===============>>#4 Votes: 39

A neat trick with groupby is run-length encoding in one line:

```python
[(c,len(list(cgen))) for c,cgen in groupby(some_string)]
```
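As a sketch of how that one-liner might be wrapped for reuse, the following pairs it with a decoder so the round trip can be checked. The function names `run_length_encode` and `run_length_decode` are illustrative and not part of `itertools`.

```python
from itertools import groupby

def run_length_encode(text):
    """Return (character, run length) pairs for consecutive runs in text."""
    # sum(1 for _ in run) counts the run without building a temporary list.
    return [(char, sum(1 for _ in run)) for char, run in groupby(text)]

def run_length_decode(pairs):
    """Rebuild the original string from (character, run length) pairs."""
    return "".join(char * count for char, count in pairs)

encoded = run_length_encode("AAAABBBCCDAA")
print(encoded)
# [('A', 4), ('B', 3), ('C', 2), ('D', 1), ('A', 2)]
print(run_length_decode(encoded))
# AAAABBBCCDAA
```

Note that `'A'` appears twice in the encoded output: runs are counted, not total occurrences.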

### ===============>>#5 Votes: 25

```python
import itertools

for key, igroup in itertools.groupby(range(12), lambda x: x // 5):
    print(key, list(igroup))
```

```
0 [0, 1, 2, 3, 4]
1 [5, 6, 7, 8, 9]
2 [10, 11]
```

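```python
import itertools

def chunker(items, chunk_size):
    '''Group items in chunks of chunk_size'''
    for _key, group in itertools.groupby(enumerate(items), lambda x: x[0] // chunk_size):
        yield (g[1] for g in group)

with open('file.txt') as fobj:
    # The original call omitted chunk_size; 10 is an arbitrary value for illustration.
    for chunk in chunker(fobj, 10):
        process(chunk)  # process() stands for whatever handles each chunk
```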
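The chunking idea can be tried without a file. The sketch below repeats the same `chunker` logic on a short string (the input `"abcdefg"` and the chunk size of 3 are arbitrary choices for illustration) and materializes each chunk as soon as it is produced.

```python
import itertools

def chunker(items, chunk_size):
    """Yield lazy chunks of up to chunk_size consecutive items."""
    indexed = enumerate(items)
    # Items whose index shares the same quotient belong to the same chunk.
    for _key, group in itertools.groupby(indexed, lambda pair: pair[0] // chunk_size):
        yield (item for _index, item in group)

# Each chunk is consumed with list() before the next one is requested.
chunks = [list(chunk) for chunk in chunker("abcdefg", 3)]
print(chunks)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g']]
```

Because every chunk reads from the same underlying iterator, a chunk should be fully consumed before moving on to the next one.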

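Another example of groupby: what happens when the keys are not sorted. In the following example, the items in xx are grouped by the values in yy. In this case, a group of zeros is output first, then a group of ones, and then another group of zeros.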

```python
import itertools

xx = range(10)
yy = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
for group in itertools.groupby(iter(xx), lambda x: yy[x]):
    print(group[0], list(group[1]))
```

```
0 [0, 1, 2]
1 [3, 4, 5]
0 [6, 7, 8, 9]
```

### ===============>>#6 Votes: 21

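Note that `list(groupby(...))` does not behave the way you might expect. The groups share the underlying iterator with the `groupby` object, so advancing to the next key invalidates the previous group:

```python
from itertools import groupby

for x in list(groupby(range(10))):
    print(list(x[1]))
```

Output as originally reported (the contents of the last group can vary between Python versions):

```
[]
[]
[]
[]
[]
[]
[]
[]
[]
[9]
```

Instead, materialize each group while iterating:

```python
def groupbylist(*args, **kwargs):
    return [(k, list(g)) for k, g in groupby(*args, **kwargs)]
```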

### ===============>>#7 Votes: 9

```python
from itertools import groupby

things = [("vehicle", "bear"), ("animal", "duck"), ("animal", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]

for key, group in groupby(things, lambda x: x[0]):
    for thing in group:
        print("A %s is a %s." % (thing[1], key))
    print(" ")
```

```
A bear is a vehicle.

A duck is a animal.
A cactus is a animal.

A speed boat is a vehicle.
A school bus is a vehicle.
```

### ===============>>#8 Votes: 7

@CaptSolo, I tried your example, but it didn't work.

```python
from itertools import groupby
[(c,len(list(cs))) for c,cs in groupby('Pedro Manoel')]
```

```
[('P', 1), ('e', 1), ('d', 1), ('r', 1), ('o', 1), (' ', 1), ('M', 1), ('a', 1), ('n', 1), ('o', 1), ('e', 1), ('l', 1)]
```

The two `o`s and the two `e`s end up in separate groups because they are not adjacent. Sorting the input first fixes that:

```python
name = list('Pedro Manoel')
name.sort()
[(c,len(list(cs))) for c,cs in groupby(name)]
```

```
[(' ', 1), ('M', 1), ('P', 1), ('a', 1), ('d', 1), ('e', 2), ('l', 1), ('n', 1), ('o', 2), ('r', 1)]
```
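If the goal is only to count how often each character occurs, `collections.Counter` is a different standard-library tool that gives the same totals without sorting. The sketch below compares the three approaches on the same string; the variable names are illustrative.

```python
from collections import Counter
from itertools import groupby

name = "Pedro Manoel"

# Unsorted input: groupby only sees consecutive runs.
consecutive = [(c, len(list(cs))) for c, cs in groupby(name)]

# Sorted input: equal characters become adjacent, so runs equal totals.
totals = [(c, len(list(cs))) for c, cs in groupby(sorted(name))]

# Counter (a swapped-in alternative, not groupby) counts regardless of order.
counter_totals = sorted(Counter(name).items())

print(totals == counter_totals)  # True
```

`groupby` remains the better fit when the runs themselves matter, as in run-length encoding.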

### ===============>>#9 Votes: 6

```python
from itertools import groupby

val = [{'name': 'satyajit', 'address': 'btm', 'pin': 560076},
       {'name': 'Mukul', 'address': 'Silk board', 'pin': 560078},
       {'name': 'Preetam', 'address': 'btm', 'pin': 560076}]

# Sort by 'pin' first so that records with the same pin are adjacent.
for pin, list_data in groupby(sorted(val, key=lambda k: k['pin']), lambda x: x['pin']):
    print(pin)
    for rec in list_data:
        print(rec)

# Output (key order follows dict insertion order on Python 3.7+):
# 560076
# {'name': 'satyajit', 'address': 'btm', 'pin': 560076}
# {'name': 'Preetam', 'address': 'btm', 'pin': 560076}
# 560078
# {'name': 'Mukul', 'address': 'Silk board', 'pin': 560078}
```

### ===============>>#10 Votes: 5

```
groupby(iterable[, keyfunc]) -> create an iterator which returns
(key, sub-iterator) grouped by each value of key(value).
```

```python
import itertools

def grouper(iterable, n):
    def coroutine(n):
        yield  # queue up coroutine
        for i in itertools.count():
            for j in range(n):
                yield i
    groups = coroutine(n)
    next(groups)  # queue up coroutine

    # groups.send ignores the item and returns the next group number
    for c, objs in itertools.groupby(iterable, groups.send):
        yield c, list(objs)
    # or instead of materializing a list of objs, just:
    # return itertools.groupby(iterable, groups.send)

list(grouper(range(10), 3))
```

```
[(0, [0, 1, 2]), (1, [3, 4, 5]), (2, [6, 7, 8]), (3, [9])]
```

### ===============>>#11 Votes: 1

```python
from itertools import groupby

# User input: a string of digits
myinput = input()

# Empty list to store the output
myoutput = []

for k, g in groupby(myinput):
    # Store (run length, digit) for each run of equal characters
    myoutput.append((len(list(g)), int(k)))

print(*myoutput)
```

### ===============>>#12 Votes: 1

```python
def groupby(data):
    # Note: this name shadows itertools.groupby if both are in scope.
    kv = {}
    for k, v in data:
        if k not in kv:
            kv[k] = [v]
        else:
            kv[k].append(v)
    return kv

# Run in IPython:
In [10]: data = [('a', 1), ('b',2),('a',2)]

In [11]: groupby(data)
Out[11]: {'a': [1, 2], 'b': [2]}
```
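The same dictionary-based grouping can be written a little more compactly with `collections.defaultdict`. This is a sketch of one possible variant, not the answer author's code; the name `group_pairs` is chosen here to avoid shadowing `itertools.groupby`.

```python
from collections import defaultdict

def group_pairs(data):
    """Collect the values of (key, value) pairs into lists per key."""
    grouped = defaultdict(list)
    for key, value in data:
        grouped[key].append(value)  # missing keys start as an empty list
    return dict(grouped)

print(group_pairs([("a", 1), ("b", 2), ("a", 2)]))
# {'a': [1, 2], 'b': [2]}
```

Unlike `itertools.groupby`, this approach does not need sorted input, at the cost of holding every group in memory at once.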