I'm looking for a way to limit the size of the groups produced by itertools.groupby.
Currently I have something like this:
>>> s = '555'
>>> grouped = groupby(s)
>>> print([(k, len(list(g))) for k, g in grouped])
[('5', 3)]
What I would like to achieve is a maximum group size of 2, so my output would be:
[('5', 2), ('5', 1)]
Is there any easy and efficient way to do this? Maybe somehow via the key argument provided to groupby?
Here is a solution using groupby and a defaultdict.
from itertools import groupby
from collections import defaultdict

s = "5555444"
desired_length = 2
counts = defaultdict(int)

def count(x):
    # Return how many times x has been seen so far, then bump the counter.
    c = counts[x]
    counts[x] += 1
    return c

# Group on (character, chunk index): the chunk index advances every
# desired_length occurrences, splitting each run into pieces.
grouped = groupby(s, key=lambda x: (x, count(x) // desired_length))
print([(k[0], len(list(g))) for k, g in grouped])
I honestly think this solution is unacceptable, as it requires keeping mutable global state (the counts dict) in sync with the iteration, but here it is. I would personally just use a buffer-like thing.
from collections import defaultdict

s = "5555444"

def my_buffer_function(sequence, desired_length):
    buffer = defaultdict(int)
    for item in sequence:
        buffer[item] += 1
        # Once an item reaches the limit, emit it and reset its count.
        if buffer[item] == desired_length:
            yield (item, buffer.pop(item))
    # Flush whatever partial counts remain.
    for k, v in buffer.items():
        yield k, v

print(list(my_buffer_function(s, 2)))
This is also a generator. Note that it counts total occurrences rather than consecutive runs, so it may lack some behaviour of groupby that you currently rely on.
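Another option, a minimal sketch that stays entirely within itertools and avoids any shared state: iterate over each run that groupby yields and slice it into chunks with islice. (The function name limited_groupby is my own, not from the question.)

```python
from itertools import groupby, islice

def limited_groupby(iterable, max_size):
    """Like groupby on identity, but split each run into chunks of at most max_size."""
    for key, group in groupby(iterable):
        while True:
            # Pull up to max_size items from the current run's iterator.
            chunk = list(islice(group, max_size))
            if not chunk:
                break
            yield key, len(chunk)

print(list(limited_groupby('555', 2)))      # [('5', 2), ('5', 1)]
print(list(limited_groupby('5555444', 2)))  # [('5', 2), ('5', 2), ('4', 2), ('4', 1)]
```

This works because each group iterator is fully consumed by the inner loop before the outer groupby advances, and it preserves groupby's consecutive-run semantics.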