
Get index value from categorized List with itertools groupby

I have multiple lists of tuples produced by nltk.FreqDist(), as follows:

totalist[0] = [('A',12),('C',1)] #index 0
totalist[1] = [('A',25),('X',3)] #index 1
totalist[2] = [('Z',3),('T',2)] #index 2
totalist[3] = [('Z',10),('M',8)] #index 3
totalist[4] = [('Z',8),('M',8)] #index 4
totalist[5] = [('C',10),('M',8)] #index 5

I want to get the original index of each list even after the lists have been grouped by groupby.

This is my code so far, but it doesn't work: it can't show the original index, because the index is lost once the lists are grouped:

from itertools import groupby

for key, group in groupby(totalist, lambda x: x[0][0]):
    for thing in group:
        # it should print its old index value here
        pass
    print(" ")

Is there a Pythonic way to solve this? Thanks in advance.

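Assuming an already sorted list

groupby only groups consecutive items that share a key, so it assumes that the list is already sorted, or at least that items with equal keys are adjacent. The example data are not in alphabetical order (A, Z, C), but equal keys are adjacent, which is enough here. You can use enumerate to preserve the original index and modify your key function accordingly: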

from itertools import groupby

# enumerate yields (old_index, thing) pairs, so the key looks inside x[1]
for key, group in groupby(enumerate(totalist), lambda x: x[1][0][0]):
    print(key)
    for old_index, thing in group:
        print('    ', old_index, thing)

Output:

A
     0 [('A', 12), ('C', 1)]
     1 [('A', 25), ('X', 3)]
Z
     2 [('Z', 3), ('T', 2)]
     3 [('Z', 10), ('M', 8)]
     4 [('Z', 8), ('M', 8)]
C
     5 [('C', 10), ('M', 8)]
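
If you want to keep the old indices rather than print them, the same idea can be sketched as a small function. This is only a minimal sketch: the helper name `indices_by_run` is hypothetical, and the sample data is copied from the question.

```python
from itertools import groupby

# Sample data copied from the question
totalist = [
    [('A', 12), ('C', 1)],
    [('A', 25), ('X', 3)],
    [('Z', 3), ('T', 2)],
    [('Z', 10), ('M', 8)],
    [('Z', 8), ('M', 8)],
    [('C', 10), ('M', 8)],
]

def indices_by_run(lists):
    """Return (key, [old indices]) for each run of adjacent equal keys."""
    runs = []
    # Each item is an (old_index, list_of_tuples) pair produced by enumerate
    for key, group in groupby(enumerate(lists), key=lambda pair: pair[1][0][0]):
        runs.append((key, [old_index for old_index, _ in group]))
    return runs

print(indices_by_run(totalist))
# [('A', [0, 1]), ('Z', [2, 3, 4]), ('C', [5])]
```

A list of pairs is used instead of a dict because, without sorting, the same key could appear in more than one run.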

Assuming an unsorted list

This is a modified solution for the case where you need to sort your list first. It is best to write one key function and use it for both the sorting and the grouping:

def key_function(x):
    return x[1][0][0]

Now, use this function twice to get consistent sorting and grouping:

for key, group in groupby(sorted(enumerate(totalist), key=key_function), key_function):
    print(key)
    for temp_thing in group:
        old_index, thing = temp_thing
        print('    old index:', old_index)
        print('    thing:', thing)

Output:

A
    old index: 0
    thing: [('A', 12), ('C', 1)]
    old index: 1
    thing: [('A', 25), ('X', 3)]
C
    old index: 5
    thing: [('C', 10), ('M', 8)]
Z
    old index: 2
    thing: [('Z', 3), ('T', 2)]
    old index: 3
    thing: [('Z', 10), ('M', 8)]
    old index: 4
    thing: [('Z', 8), ('M', 8)]
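
To see why the sorting step matters, here is a small sketch on input where equal keys are not adjacent. The list `shuffled` and the name `first_label` are made up for illustration. Because Python's sorted is stable, the old indices inside each group stay in ascending order.

```python
from itertools import groupby

def first_label(pair):
    # pair is (old_index, list_of_tuples); group on the first tuple's label
    return pair[1][0][0]

# Hypothetical data in which equal keys are NOT adjacent
shuffled = [
    [('Z', 3), ('T', 2)],
    [('A', 12), ('C', 1)],
    [('Z', 10), ('M', 8)],
    [('A', 25), ('X', 3)],
]

# Without sorting, every change of key starts a new group
unsorted_runs = [
    (key, [i for i, _ in group])
    for key, group in groupby(enumerate(shuffled), first_label)
]

# Sorting with the same key function first merges equal keys into one group
sorted_runs = [
    (key, [i for i, _ in group])
    for key, group in groupby(sorted(enumerate(shuffled), key=first_label), first_label)
]

print(unsorted_runs)  # [('Z', [0]), ('A', [1]), ('Z', [2]), ('A', [3])]
print(sorted_runs)    # [('A', [1, 3]), ('Z', [0, 2])]
```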
