
Grouping of items in a list of lists based on another list of lists

I have the following list of lists:

mylist = [['NNP', 'NN', 'VBZ', 'VBN', 'NNP', 'NNP'],
           ['VB', 'VBN'],
           ['NNP'],
           ['VB', 'NN'],
           ['NN', 'NN']]

I have one more list of lists:

cond = [['NNP', 'NN'], ['VBZ', 'VBN', 'VB']]

I want to group the items of each list in mylist based on the lists in cond and get the following output:

out = [['NNP', 'NN'], ['VBZ', 'VBN'], ['NNP', 'NNP'], ['VB', 'VBN'], ['NNP'], ['VB'], ['NN'], ['NN', 'NN']]

The items should be grouped so that every group in the output is drawn from only one list in cond, i.e., ['NN', 'VBZ'] or ['VBN', 'NNP'] is not expected in the output.

This is not a case where I have to split a list when some item is encountered.

I went through many code examples where lists are split based on a condition, but my problem is different, so this is not a duplicate question.

I don't know what approach to take to start coding.

Here's the best I could come up with:

import itertools

mylist = [['NNP', 'NN', 'VBZ', 'VBN', 'NNP', 'NNP'],
           ['VB', 'VBN'],
           ['NNP'],
           ['VB', 'NN'],
           ['NN', 'NN']]

cond = [['NNP', 'NN'], ['VBZ', 'VBN', 'VB']]

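out = []
for sublist in mylist:
    while sublist:
        # For each condition, take the longest leading run of items found in it,
        # then keep the first non-empty run. If the first item is in no
        # condition, every run is empty and indexing [0] raises IndexError.
        runs = [list(itertools.takewhile(lambda x: x in condition, sublist)) for condition in cond]
        match = [run for run in runs if run][0]
        out.append(match)
        # Drop the matched items from the front and continue with the rest.
        sublist = sublist[len(match):]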

print(out)

First, we iterate through all of the sub-lists. For each one, we use the itertools function takewhile to build, for every condition in cond, the run of leading elements that belong to that condition. Some conditions will not match the first element at all, so we filter out the [] results and keep the first non-empty run. We add that run to our final list and remove the same number of elements from the front of the sub-list. We then repeat the takewhile step until the sub-list has been exhausted, and we repeat the entire process for every sub-list in mylist.
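The same steps can be wrapped in a function, which makes the failure case explicit. This is only a sketch: the name split_by_cond is hypothetical, and raising ValueError for an item that appears in no list of cond is an assumption, since the question does not say what should happen then.

```python
from itertools import takewhile


def split_by_cond(sublist, cond):
    """Split sublist into consecutive runs whose items all belong to one list in cond."""
    groups = []
    rest = list(sublist)
    while rest:
        # Longest leading run for each condition; c=condition binds the
        # current condition to the lambda at definition time.
        runs = (list(takewhile(lambda tag, c=condition: tag in c, rest))
                for condition in cond)
        # Keep the first non-empty run, or None if no condition matches.
        match = next((run for run in runs if run), None)
        if match is None:
            # Assumed behaviour: fail loudly instead of looping forever.
            raise ValueError(f"{rest[0]!r} is not in any list of cond")
        groups.append(match)
        rest = rest[len(match):]
    return groups


cond = [['NNP', 'NN'], ['VBZ', 'VBN', 'VB']]
print(split_by_cond(['NNP', 'NN', 'VBZ', 'VBN', 'NNP', 'NNP'], cond))
# [['NNP', 'NN'], ['VBZ', 'VBN'], ['NNP', 'NNP']]
```

Calling it on each sub-list of mylist and concatenating the results gives the same out as the loop above.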

itertools is a very powerful library in Python, and you should familiarize yourself with it if you work with lists or other iterables a lot.
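As one more illustration of what itertools offers, the grouping can probably also be written with itertools.groupby. This is an alternative sketch, not the approach described above, and it assumes the lists in cond are disjoint. The name tag_to_group is made up for the example.

```python
from itertools import groupby

mylist = [['NNP', 'NN', 'VBZ', 'VBN', 'NNP', 'NNP'],
          ['VB', 'VBN'],
          ['NNP'],
          ['VB', 'NN'],
          ['NN', 'NN']]

cond = [['NNP', 'NN'], ['VBZ', 'VBN', 'VB']]

# Map each tag to the index of the cond list that contains it
# (assumes no tag appears in more than one cond list).
tag_to_group = {tag: i for i, tags in enumerate(cond) for tag in tags}

# groupby starts a new run whenever the key (the cond index) changes.
out = [list(run)
       for sublist in mylist
       for _, run in groupby(sublist, key=tag_to_group.get)]

print(out)
# [['NNP', 'NN'], ['VBZ', 'VBN'], ['NNP', 'NNP'], ['VB', 'VBN'], ['NNP'], ['VB'], ['NN'], ['NN', 'NN']]
```

Note one difference in behaviour: a tag that is in no list of cond gets the key None here, so consecutive unknown tags are grouped together rather than raising an error.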
