Get index value from categorized list with itertools groupby

I have multiple lists of tuples produced by nltk.FreqDist(), as follows:

totalist = [
    [('A', 12), ('C', 1)],   # index 0
    [('A', 25), ('X', 3)],   # index 1
    [('Z', 3), ('T', 2)],    # index 2
    [('Z', 10), ('M', 8)],   # index 3
    [('Z', 8), ('M', 8)],    # index 4
    [('C', 10), ('M', 8)],   # index 5
]

I want to get each list's original index even after the lists have been grouped with groupby.

This is my code so far. It does not work: grouping discards the original positions, so I cannot print the index.

from itertools import groupby

for key, group in groupby(totalist, lambda x: x[0][0]):
    for thing in group:
        pass  # it should print its old index value here
    print(" ")

Is there a Pythonic way to solve this? Thanks in advance.

Assuming an already sorted list

groupby only groups consecutive items that share a key, so it assumes that items with equal keys are already adjacent, which is normally achieved by sorting on that key. The example data satisfy this assumption. You can use enumerate to preserve the original index and modify your key function accordingly:

from itertools import groupby

for key, group in groupby(enumerate(totalist), lambda x: x[1][0][0]):
    print(key)
    # each element of the group is an (old_index, thing) pair
    for old_index, thing in group:
        print('    ', old_index, thing)

Output:

A
     0 [('A', 12), ('C', 1)]
     1 [('A', 25), ('X', 3)]
Z
     2 [('Z', 3), ('T', 2)]
     3 [('Z', 10), ('M', 8)]
     4 [('Z', 8), ('M', 8)]
C
     5 [('C', 10), ('M', 8)]
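
To illustrate why the adjacency assumption matters, here is a minimal sketch using a hypothetical list (unsorted_list and first_label are names invented for this example, not part of the question) in which the two 'A' entries are not next to each other. Without sorting, groupby reports 'A' twice, once per consecutive run:

```python
from itertools import groupby

# Hypothetical data: the two 'A' entries are not adjacent.
unsorted_list = [
    [('A', 12), ('C', 1)],
    [('Z', 3), ('T', 2)],
    [('A', 25), ('X', 3)],
]

def first_label(pair):
    # pair is (old_index, item); the key is the first label of the item
    return pair[1][0][0]

# Collect (key, [old indices]) for each consecutive run of equal keys.
runs = [(key, [old_index for old_index, _ in group])
        for key, group in groupby(enumerate(unsorted_list), first_label)]
print(runs)  # [('A', [0]), ('Z', [1]), ('A', [2])]
```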

Assuming an unsorted list

This is a modified solution for the case where you need to sort your list first. It is best to write one function that is used for both the sorting and the grouping:

def key_function(x):
    # x is an (old_index, thing) pair; the key is the first label in thing
    return x[1][0][0]

Now use this function twice to get consistent sorting and grouping:

# sorted() is stable, so old indices stay in ascending order within each key
indexed = sorted(enumerate(totalist), key=key_function)
for key, group in groupby(indexed, key_function):
    print(key)
    for old_index, thing in group:
        print('    old index:', old_index)
        print('    thing:', thing)

Output:

A
    old index: 0
    thing: [('A', 12), ('C', 1)]
    old index: 1
    thing: [('A', 25), ('X', 3)]
C
    old index: 5
    thing: [('C', 10), ('M', 8)]
Z
    old index: 2
    thing: [('Z', 3), ('T', 2)]
    old index: 3
    thing: [('Z', 10), ('M', 8)]
    old index: 4
    thing: [('Z', 8), ('M', 8)]
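
If only the original indices are needed per key, the same sort-then-group approach can collect them into a dictionary. This is a sketch under the same assumptions as above; indexed and indices_by_key are names introduced here for illustration:

```python
from itertools import groupby

totalist = [
    [('A', 12), ('C', 1)],   # index 0
    [('A', 25), ('X', 3)],   # index 1
    [('Z', 3), ('T', 2)],    # index 2
    [('Z', 10), ('M', 8)],   # index 3
    [('Z', 8), ('M', 8)],    # index 4
    [('C', 10), ('M', 8)],   # index 5
]

def key_function(pair):
    # pair is (old_index, thing); the key is the first label in thing
    return pair[1][0][0]

# Sort and group with the same key so equal keys are adjacent.
indexed = sorted(enumerate(totalist), key=key_function)
indices_by_key = {
    key: [old_index for old_index, _ in group]
    for key, group in groupby(indexed, key_function)
}
print(indices_by_key)  # {'A': [0, 1], 'C': [5], 'Z': [2, 3, 4]}
```

The old index can then be used to look the original entry back up, for example totalist[indices_by_key['C'][0]].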


 