
Save consecutive indices in sequence

I'm analysing events that occur in sequence, as in the example below: a list of tuples, each holding an event type and its index in a dataframe. I want to collect the indices of consecutive events of the same type into one group, starting a new group whenever the type changes.

l=[('question', 0),
   ('response', 1),
   ('response', 2),
   ('response', 3),
   ('response', 4),
   ('response', 5),
   ('response', 6),
   ('response', 7),
   ('response', 8),
   ('response', 9),
   ('response', 10),
   ('response', 11),
   ('question', 12),
   ('response', 13),
   ('response', 14),
   ('response', 15),
   ('question', 16),
   ('response', 17),
   ('question', 18),
   ('response', 19),
   ('question', 20),
   ('response', 21),
   ('question', 22)
  ]

Desired output:

[('question', [0]),
 ('response', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]),
 ('question', [12]),
 ('response', [13, 14, 15]),
 ('question', [16]),
 ('response', [17]),
 ('question', [18]),
 ('response', [19]),
 ('question', [20]),
 ('response', [21]),
 ('question', [22])]

Here is my solution. Is there a better way of doing this?

def fxn(listitem):
    collected_items = []
    newlist = None  # current group as [type, [indices]]
    for element in listitem:
        if newlist is not None and element[0] == newlist[0]:
            # Same type as the current group: keep collecting indices.
            newlist[1].append(element[1])
        else:
            # First element or type change: store the finished group, start a new one.
            if newlist is not None:
                collected_items.append(tuple(newlist))
            newlist = [element[0], [element[1]]]
    # Store the last group, which the loop itself never appends.
    if newlist is not None:
        collected_items.append(tuple(newlist))
    return collected_items

fxn(l)

Here's one way to do it with itertools.groupby and a list comprehension:

from itertools import groupby

r = [(k, [y for _, y in g]) for k, g in groupby(l, lambda x: x[0])]
print(r)
# [('question', [0]), ('response', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]), ('question', [12]), ('response', [13, 14, 15]), ('question', [16]), ('response', [17]), ('question', [18]), ('response', [19]), ('question', [20]), ('response', [21]), ('question', [22])]
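
What makes this work is that groupby only merges adjacent items with equal keys, which matches the "as long as the type does not change" requirement. Below is a minimal sketch of the same idea wrapped in a reusable function. The name group_consecutive, the sample data, and the use of operator.itemgetter in place of the lambda are my own choices for illustration, not part of the answer above:

```python
from itertools import groupby
from operator import itemgetter


def group_consecutive(pairs):
    """Collapse runs of adjacent (type, index) pairs that share a type.

    Hypothetical helper name; `pairs` is any iterable of 2-tuples.
    """
    # groupby starts a new group whenever the key changes, so equal types
    # that are not adjacent end up in separate groups.
    return [
        (kind, [index for _, index in run])
        for kind, run in groupby(pairs, key=itemgetter(0))
    ]


# Made-up sample data, shorter than the list in the question.
sample = [('question', 0), ('response', 1), ('response', 2), ('question', 3)]
print(group_consecutive(sample))
# [('question', [0]), ('response', [1, 2]), ('question', [3])]
```

Note that the two 'question' entries stay in separate groups because they are not adjacent. Sorting the input first would merge them, which is not what the question asks for.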

Here is a solution as a generator:

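def my_fxn(input_list):
    output = None  # current group as (type, [indices])
    for key, value in input_list:
        if output is None or key != output[0]:
            # Type changed: emit the finished group before starting a new one.
            if output is not None:
                yield output
            output = (key, [value])
        else:
            output[1].append(value)
    # Emit the last group; skip it for empty input, where output is still None.
    if output is not None:
        yield output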
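
A generator produces groups lazily, so it also accepts input that is itself an iterator, and the groups can be consumed one at a time. Here is a small sketch of how such a generator might be used. iter_runs is a hypothetical name for a compact variant written so the example runs on its own, and the sample data is made up:

```python
def iter_runs(pairs):
    """Yield (type, [indices]) for each run of adjacent pairs with the same type."""
    current = None
    for kind, index in pairs:
        if current is not None and kind == current[0]:
            current[1].append(index)
        else:
            # A new run begins: hand out the finished one first.
            if current is not None:
                yield current
            current = (kind, [index])
    # Hand out the final run, unless the input was empty.
    if current is not None:
        yield current


# The input is a one-shot iterator here, not a list.
events = iter([('question', 0), ('response', 1), ('response', 2), ('question', 3)])
runs = iter_runs(events)
first = next(runs)   # only as much input is read as this group needs
rest = list(runs)    # the remaining groups
print(first)
# ('question', [0])
print(rest)
# [('response', [1, 2]), ('question', [3])]
```

Wrapping the call in list(...) gives the same result as the groupby version when everything is needed at once.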
