
Split string into inner list without affecting relations

I have a list of lists, lol:

[ ['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg'], 
  ['canēs', 'canis +N +Acc +Pl', 'canis +N +Abl +Pl'], ...] 

Each of the inner lists has 3 elements, all of which are currently strings. What I want to do is split the second and third items on the space character to create something like this:

[ 
['filiabus', ['filia', '+N', '+Abl', '+Sg'], ['filia', '+N', '+Dat', '+Sg'] ], 
...
] 

It's important that these new nested lists remain part of the same list that holds the first item (e.g. filiabus). The first element can be a list unto itself if that makes it easier.

I feel like something like this should work:

test=[]
for i in lol:
    for j in i:
        test.append([j[0],j[1].split(' '), j[2].split(' ')])

but it just produces:

>>> test
[['f', ['i'], ['l']], ['f', ['i'], ['l']], ['f', ['i'], ['l']], ['c', ['a'], ['n']], ['c', ['a'], ['n']], ['c', ['a'], ['n']]]

Thanks!

As your initial list contains lists of length 3, you can unpack each of them directly into 3 variables, like

 for name, v1, v2 in values:

Then the result is just the first value followed by the other two, each split (calling split() with no separator splits on runs of consecutive whitespace).

values = [['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg'],
          ['canēs', 'canis +N +Acc +Pl', 'canis +N +Abl +Pl']]

result = [[name, v1.split(), v2.split()] for name, v1, v2 in values]

print(result)  # [['filiabus', ['filia', '+N', '+Abl', '+Sg'], ['filia', '+N', '+Dat', '+Sg']], 
                # ['canēs', ['canis', '+N', '+Acc', '+Pl'], ['canis', '+N', '+Abl', '+Pl']]]
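
The difference between split() and split(' ') only shows up when the spacing is irregular. A minimal sketch, using a made-up string with a doubled space and a trailing space (this string is an assumption for illustration, not taken from the question's data):

```python
# Hypothetical input with a doubled space and a trailing space.
text = 'filia  +N +Abl '

# With an explicit separator, every single space is a split point,
# so empty strings appear between adjacent spaces and at the end.
with_sep = text.split(' ')

# With no separator, a run of whitespace counts as one delimiter and
# leading/trailing whitespace is ignored.
no_sep = text.split()

print(with_sep)  # ['filia', '', '+N', '+Abl', '']
print(no_sep)    # ['filia', '+N', '+Abl']
```

For the data in the question, where items are separated by exactly one space, both calls give the same result.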

You can do it like this, for lists of any size:

lol = [['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg'], 
       ['canēs', 'canis +N +Acc +Pl', 'canis +N +Abl +Pl']]


def transform(sublist):
    first, *others = sublist
    return [first, *(item.split() for item in others)]

out = [transform(sublist) for sublist in lol]

print(out)
# [['filiabus', ['filia', '+N', '+Abl', '+Sg'], ['filia', '+N', '+Dat', '+Sg']], 
#  ['canēs', ['canis', '+N', '+Acc', '+Pl'], ['canis', '+N', '+Abl', '+Pl']]]
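
Because transform uses starred unpacking, it should not depend on every inner list having exactly three items. A small sketch with hypothetical rows of different lengths (the rows data is invented for illustration, and transform is repeated so the snippet runs on its own):

```python
def transform(sublist):
    # Same function as above, repeated so this snippet is self-contained.
    first, *others = sublist
    return [first, *(item.split() for item in others)]

# Hypothetical rows with one, two, and four items.
rows = [
    ['rosa'],
    ['canem', 'canis +N +Acc +Sg'],
    ['x', 'a b', 'c d', 'e f'],
]

out = [transform(row) for row in rows]
print(out)
# [['rosa'], ['canem', ['canis', '+N', '+Acc', '+Sg']], ['x', ['a', 'b'], ['c', 'd'], ['e', 'f']]]
```

An empty inner list would still raise a ValueError, since there is no first item to unpack.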

I believe this is what you are trying to do.

lol = [ ['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg'], 
  ['canēs', 'canis +N +Acc +Pl', 'canis +N +Abl +Pl']] 

# Iterate through each list in lol
for i in range(len(lol)):
    # Iterate through each string in the list
    for j in range(len(lol[i])):
        # Only split if string contains a space
        if " " in lol[i][j]:
            # Reassign position
            lol[i][j] = lol[i][j].split(" ")
        
print(lol)
# Prints 
#[['filiabus', ['filia', '+N', '+Abl', '+Sg'], ['filia', '+N', '+Dat', '+Sg']], 
#['canēs', ['canis', '+N', '+Acc', '+Pl'], ['canis', '+N', '+Abl', '+Pl']]]

The critical thing is to iterate over the indices rather than over the elements themselves. This lets you assign a new value to each element's position in the list.

Also, note that when you indexed into j in your example (j[0], j[1], j[2]), you were accessing individual letters, not whole strings. Each string would be i[x].
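
To illustrate why assigning to the loop variable is not enough, here is a minimal sketch. It uses enumerate, which is an alternative to range(len(...)) and not part of the code above; the data is a shortened copy of the question's list, and the snapshot variable is only there to record the intermediate state.

```python
lol = [['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg']]

# Rebinding the loop variable only changes the local name `item`;
# the inner list still holds the original strings afterwards.
for sublist in lol:
    for item in sublist:
        item = item.split(' ')
snapshot = [list(sublist) for sublist in lol]  # copy of the state after loop 1
print(snapshot)
# [['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg']]

# Assigning through an index writes the new value into the list.
for sublist in lol:
    for j, item in enumerate(sublist):
        if ' ' in item:
            sublist[j] = item.split(' ')
print(lol)
# [['filiabus', ['filia', '+N', '+Abl', '+Sg'], ['filia', '+N', '+Dat', '+Sg']]]
```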

This answer is the more readable version; for a shorter list comprehension version, check out azro's answer.

The second for loop is redundant. Explanation: the variable i iterates over the inner lists of lol, while the variable j iterates over the strings in each inner list.

So, for example, in the first iteration you will have:

i=['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg']
j='filiabus'

You can now see that j[0] = 'f' and j[1] = 'i' (so j[1].split(' ') = ['i']).
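
The same trace can be reproduced by recording the loop variables. A short sketch using only the first inner list from the question (the trace list is a name introduced here for illustration):

```python
lol = [['filiabus', 'filia +N +Abl +Sg', 'filia +N +Dat +Sg']]

trace = []
for i in lol:        # i is an inner list
    for j in i:      # j is one string from that list
        # Indexing a string yields single characters.
        trace.append((j, j[0], j[1], j[1].split(' ')))

for row in trace:
    print(row)
# ('filiabus', 'f', 'i', ['i'])
# ('filia +N +Abl +Sg', 'f', 'i', ['i'])
# ('filia +N +Dat +Sg', 'f', 'i', ['i'])
```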

Solution: remove the inner loop:

test = []
for i in lol:
    test.append([i[0],i[1].split(' '), i[2].split(' ')])

or, in a more elegant (and Pythonic) way:

test = [[i[0], i[1].split(), i[2].split()] for i in lol]
