
Limitation of group size in itertools.groupby

I'm looking for a way to limit the size of the groups created by itertools.groupby .

Currently I have something like this:

>>> s = '555'
>>> grouped = groupby(s)
>>> print([(k, len(list(g))) for k, g in grouped])
[('5', 3)]

What I would like to achieve is a maximum group size of 2, so my output would be:

[('5', 2), ('5', 1)]

Is there an easy and efficient way to do this? Maybe via the key argument to groupby ?

Here is a solution using groupby and a defaultdict .

from itertools import groupby
from collections import defaultdict

s = "5555444"
desired_length = 2
counts = defaultdict(int)

def count(x):
    # Return how many times x has been seen so far, then bump its counter.
    c = counts[x]
    counts[x] += 1
    return c

# The key pairs each item with its occurrence count floor-divided by the cap,
# so every desired_length-th occurrence of an item starts a new group.
grouped = groupby(s, key=lambda x: (x, count(x) // desired_length))
print([(k[0], len(list(g))) for k, g in grouped])
# [('5', 2), ('5', 2), ('4', 2), ('4', 1)]

I honestly think this solution is unacceptable, since it requires keeping track of global state at all times, but here it is. I would personally just use a buffer-like approach:

from collections import defaultdict

s = "5555444"

def my_buffer_function(sequence, desired_length):
    # Count items as they arrive; emit a group as soon as it reaches the cap.
    buffer = defaultdict(int)
    for item in sequence:
        buffer[item] += 1
        if buffer[item] == desired_length:
            yield (item, buffer.pop(item))
    # Flush any partially filled groups at the end.
    for k, v in buffer.items():
        yield k, v

print(list(my_buffer_function(s, 2)))
# [('5', 2), ('5', 2), ('4', 2), ('4', 1)]

This is also a generator. Note, though, that it counts occurrences across the whole sequence rather than consecutive runs (e.g. for "5455" it would group the two non-adjacent 5s together), so it may miss some of the groupby behavior you currently rely on.
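If preserving groupby's consecutive-run behavior matters, one alternative sketch (not from the answers above; capped_groups is a name I'm introducing) is to let groupby form the runs as usual and then split each run into chunks with itertools.islice:

```python
from itertools import groupby, islice

def capped_groups(sequence, cap):
    # Group consecutive equal items first, then split each run
    # into chunks of at most `cap` items.
    for key, group in groupby(sequence):
        while True:
            chunk = list(islice(group, cap))
            if not chunk:
                break
            yield key, len(chunk)

print(list(capped_groups("5555444", 2)))
# [('5', 2), ('5', 2), ('4', 2), ('4', 1)]
```

Because islice consumes the group iterator lazily, this needs no global state, and non-adjacent runs of the same item stay separate: capped_groups("5455", 2) yields [('5', 1), ('4', 1), ('5', 2)].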
