Splitting strings inside a list of lists by spaces

Suppose I have the following structure:

t = [['I will','take','care'],['I know','what','to','do']]

As you can see, the first list contains 'I will'; I want such strings split into two elements, 'I' and 'will', so that the result is:

[['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]

A quick and dirty approach is the following:

train_text_new = []

for sent in t:
    new = []
    for word in sent:
        temp = word.split(' ')
        for item2 in temp:
            new.append(item2)
    train_text_new.append(new)

But I would like to know whether there is a more readable, and maybe more efficient, way to solve this problem.

You could make a simple generator that yields the split pieces, then use it in a list comprehension:

t = [['I will','take','care'],['I know','what','to','do']]

def splitWords(l):
    for words in l:
        yield from words.split()

[list(splitWords(sublist)) for sublist in t]
# [['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]
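
One detail worth noting: the loop in the question calls split(' '), while this answer calls split() with no argument. The two agree on the sample data but differ when a string contains repeated or trailing spaces. A minimal sketch, where the string messy is a made-up input that does not appear in the question:

```python
# Hypothetical input: two spaces in the middle and one trailing space.
messy = 'I  will '

# split(' ') cuts at every single space, so empty strings appear.
by_space = messy.split(' ')

# split() with no argument treats any run of whitespace as one separator
# and drops leading and trailing whitespace.
by_whitespace = messy.split()

print(by_space)       # ['I', '', 'will', '']
print(by_whitespace)  # ['I', 'will']
```

If the input may be irregular like this, split() is probably the behaviour you want.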

You can try this, assuming the split always happens on the first element of each sublist:

t = [['I will','take','care'],['I know','what','to','do']]
[start.split()+rest for start,*rest in t]
# [['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]

If any word in a sublist may need splitting, try this instead:

[[j for i in lst for j in i.split()] for lst in t]
# [['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]
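
To see how much the first version depends on its assumption, here is a small comparison of the two comprehensions. The list u is a made-up input (not from the question) in which a multi-word string sits in second position:

```python
# Hypothetical input: 'good care' is not the first element of its sublist.
u = [['take', 'good care'], ['I know', 'what']]

# Splits only the first element of each sublist.
first_only = [start.split() + rest for start, *rest in u]

# Splits every element of each sublist.
every_word = [[j for i in lst for j in i.split()] for lst in u]

print(first_only)  # [['take', 'good care'], ['I', 'know', 'what']]
print(every_word)  # [['take', 'good', 'care'], ['I', 'know', 'what']]
```

Also note that the unpacking form start, *rest raises a ValueError on an empty sublist, whereas the nested comprehension simply returns an empty list for it.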

Joining every inner list into a single string with join, then splitting that string back into a list with split, will do the trick:

t = [['I will','take','care'],['I know','what','to','do']]
res = [' '.join(i).split() for i in t]
print(res)
# Output: [['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]

You could use itertools.chain.from_iterable to do the flattening after splitting:

from itertools import chain

t = [['I will','take','care'],['I know','what','to','do']]

print([list(chain.from_iterable(x.split() for x in y)) for y in t])

Output:

[['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]
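
Since the question also asks about efficiency, a rough way to compare the approaches is to check that they agree and then time them with timeit. This is only a sketch: the function names are hypothetical labels chosen here, and the timings depend on the machine, the Python version and the input size, so no figures are quoted.

```python
from itertools import chain
import timeit

t = [['I will', 'take', 'care'], ['I know', 'what', 'to', 'do']]
expected = [['I', 'will', 'take', 'care'], ['I', 'know', 'what', 'to', 'do']]

def nested_loops(data):
    # Explicit loops as in the question, but using split() and extend().
    result = []
    for sent in data:
        new = []
        for word in sent:
            new.extend(word.split())
        result.append(new)
    return result

def nested_comprehension(data):
    return [[piece for word in sent for piece in word.split()] for sent in data]

def join_then_split(data):
    return [' '.join(sent).split() for sent in data]

def chain_flatten(data):
    return [list(chain.from_iterable(word.split() for word in sent)) for sent in data]

candidates = [nested_loops, nested_comprehension, join_then_split, chain_flatten]

for fn in candidates:
    # Every variant must produce the same result before timing means anything.
    assert fn(t) == expected, fn.__name__
    seconds = timeit.timeit(lambda: fn(t), number=10_000)
    print(f'{fn.__name__:22s} {seconds:.4f} s for 10,000 calls')
```

On lists this small the differences are unlikely to matter, so readability is a reasonable tie-breaker; rerun the timing on data of realistic size before drawing conclusions.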
