Convert a list of lists of tuples to a tuple of lists of lists in Python

Question

I am writing a program for NER tagging with nltk and Mallet. I have to convert between two formats of input data that I cannot change.

The data basically contains words with their associated tags for supervised learning, but the data is subdivided into sentences, hence the list of lists.

The first format is

tuple(list(list(word)),list(list(tag)))

and the second format is

list(list(tuple(word,tag)))

Currently I am converting it like this (format 2 => format 1):

([[tup[0] for tup in sent] for sent in train_set],
 [[tup[1] for tup in sent] for sent in train_set])

Sample data:

 [[('Steve','PERSON'),('runs','NONE'),('Apple','ORGANIZATION')],[('Today','NONE'),('is','NONE'),('June','DATETIME'),('27th','DATETIME')]]

and expected output:

 ([['Steve', 'runs', 'Apple' ],['Today','is','June','27th']],
  [['PERSON','NONE','ORGANIZATION'],['NONE','NONE','DATETIME','DATETIME']])

I perform the conversion in both directions.

EDIT: I don't necessarily want it to be shorter. Please just suggest a better (and more readable) way of doing it in Python 2.7 (with a code sample).

Answer 1

Converting tuple(list(list(word)),list(list(tag))) to list(list(tuple(word,tag))) (format 1 => format 2) is easy:

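def convert(data_structure):
    # Format 1 -> format 2: pair each sentence's words with its tags.
    sentences, tags = data_structure
    container = []
    for words, sentence_tags in zip(sentences, tags):
        # list() keeps the result a list on Python 3, where zip is lazy.
        container.append(list(zip(words, sentence_tags)))

    return container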

Converting in the other direction (format 2 => format 1) takes a bit more code, but it is not complicated if you simply use nested for loops:

def convert(data_structure):
    sentences = []
    tags = []

    for sentence in data_structure:
        sentence_words = []
        sentence_tags = []

        for word, tag in sentence:
            sentence_words.append(word)
            sentence_tags.append(tag)

        sentences.append(sentence_words)
        tags.append(sentence_tags)

    return (sentences, tags)

The code can probably be shortened further, but hopefully the general principle is clear.
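Both directions can also be written more compactly with comprehensions. The following is a minimal sketch rather than part of the original answer: the function names `to_pairs` and `to_columns` are illustrative, and `list(zip(...))` is used on the assumption that the code should behave the same on Python 2.7 and Python 3.

```python
def to_pairs(data_structure):
    """Format 1 -> format 2: (sentences, tags) -> list of lists of (word, tag)."""
    sentences, tags = data_structure
    return [list(zip(words, sentence_tags))
            for words, sentence_tags in zip(sentences, tags)]


def to_columns(tagged_sentences):
    """Format 2 -> format 1: list of lists of (word, tag) -> (sentences, tags)."""
    sentences = [[word for word, _ in sentence] for sentence in tagged_sentences]
    tags = [[tag for _, tag in sentence] for sentence in tagged_sentences]
    return (sentences, tags)


train_set = [[('Steve', 'PERSON'), ('runs', 'NONE'), ('Apple', 'ORGANIZATION')],
             [('Today', 'NONE'), ('is', 'NONE'), ('June', 'DATETIME'), ('27th', 'DATETIME')]]

columns = to_columns(train_set)
# Converting back should reproduce the original structure.
assert to_pairs(columns) == train_set
```

Since the two functions are inverses of each other on well-formed data, a round-trip check like the last line is a cheap way to test both at once.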

Answer 2

You can convert the inner tuples to iterators (using iter) and then call next on them in a nested list comprehension:

lis = [[('Steve','PERSON'),('runs','NONE'),('Apple','ORGANIZATION')],
       [('Today','NONE'),('is','NONE'),('June','DATETIME'),('27th','DATETIME')]]

it = [[iter(y) for y in x] for x in lis]
n = len(lis[0][0])  #Number of iterations required.
print [[[next(x) for x in i] for i in it] for _ in range(n)]

Output:

[[['Steve', 'runs', 'Apple'], ['Today', 'is', 'June', '27th']],
 [['PERSON', 'NONE', 'ORGANIZATION'], ['NONE', 'NONE', 'DATETIME', 'DATETIME']]]
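Note that the result above is a list, while the question asks for a tuple, and that the iterators are exhausted after one use. A possible way to package the same idea as a reusable function is sketched below; the name `split_columns` is hypothetical, and the sketch assumes the input is non-empty and every tuple has the same number of fields.

```python
def split_columns(tagged_sentences):
    # One iterator per (word, tag) tuple; each next() call yields its next field.
    iterators = [[iter(pair) for pair in sentence] for sentence in tagged_sentences]
    # Number of fields per tuple (2 for (word, tag)); assumes non-empty input.
    width = len(tagged_sentences[0][0])
    # Each pass over the iterators collects one field from every tuple.
    return tuple([[next(it) for it in sentence] for sentence in iterators]
                 for _ in range(width))


lis = [[('Steve', 'PERSON'), ('runs', 'NONE'), ('Apple', 'ORGANIZATION')],
       [('Today', 'NONE'), ('is', 'NONE'), ('June', 'DATETIME'), ('27th', 'DATETIME')]]

words, tags = split_columns(lis)
```

Because fresh iterators are built on every call, the function can be called repeatedly on the same data.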

Answer 3

I think the correct solution is this one:

>>> data = [[('Steve','PERSON'),('runs','NONE'),('Apple','ORGANIZATION')],[('Today','NONE'),('is','NONE'),('June','DATETIME'),('27th','DATETIME')]]
>>> tuple([ map(list, (zip(*x))) for x in data ])
([['Steve', 'runs', 'Apple'], ['PERSON', 'NONE', 'ORGANIZATION']], [['Today', 'is', 'June', '27th'], ['NONE', 'NONE', 'DATETIME', 'DATETIME']])
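Note that the output above is grouped per sentence (the words and tags of the first sentence, then those of the second), which differs from the expected output in the question. One way to reach the format from the question is to transpose a second time. This is only a sketch: the name `transpose_tagged` is hypothetical, comprehensions replace `map` so it runs on both Python 2.7 and Python 3, and it assumes every sentence contains at least one (word, tag) pair.

```python
def transpose_tagged(data):
    # First transpose: split each sentence into [words, tags].
    per_sentence = [[list(column) for column in zip(*sentence)] for sentence in data]
    # Second transpose: gather all word lists together and all tag lists together.
    return tuple(list(group) for group in zip(*per_sentence))


data = [[('Steve', 'PERSON'), ('runs', 'NONE'), ('Apple', 'ORGANIZATION')],
        [('Today', 'NONE'), ('is', 'NONE'), ('June', 'DATETIME'), ('27th', 'DATETIME')]]

result = transpose_tagged(data)
```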

3 answers

Answer 1: score 2, accepted, 2014-01-03 17:08:43

Answer 2: score 1, 2014-01-03 15:40:05

Answer 3: score 0, 2014-01-03 15:48:28
