Python: How to split a list based on a specific element

Question

If we have the following list in Python:

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

How do I split this into sublists that each end with the full stop? I want to get the following elements in my new list:

["I","am","good","."]
["I","like","you","."]
["we","are","not","friends","."]

My attempt so far:

cleaned_sentence = []
a = 0
while a < len(sentence):
    current_word = sentence[a]
    if current_word == "." and len(cleaned_sentence) == 0:
        cleaned_sentence.append(sentence[0:sentence.index(".")+1])
        a += 1
    elif current_word == "." and len(cleaned_sentence) > 0:
        sub_list = sentence[sentence.index(".")+1:-1]
        sub_list.append(sentence[-1])
        cleaned_sentence.append(sub_list[0:sentence.index(".")+1])
        a += 1
    else:
        a += 1

for each in cleaned_sentence:
    print(each)

Running this on sentence produces:

['I', 'am', 'good', '.']
['I', 'like', 'you', '.']
['I', 'like', 'you', '.']

Answer 1

You can use itertools.groupby:

from itertools import groupby
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip(i, i)])

This outputs:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

If your list doesn't always end with '.', then you can use itertools.zip_longest instead:

from itertools import groupby, zip_longest

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends"]
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip_longest(i, i, fillvalue=[])])

This outputs:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends']]
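To see why pairing consecutive groups works, it can help to look at what groupby yields for this input. The following is a minimal sketch; the name groups is only used here for illustration.

```python
from itertools import groupby

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

# '.'.__ne__(x) is True for ordinary words and False for '.', so groupby
# alternates between runs of words (key True) and runs of dots (key False).
groups = [(key, list(g)) for key, g in groupby(sentence, key='.'.__ne__)]
for key, g in groups:
    print(key, g)
```

Because zip(i, i) pulls twice from the same generator, each run of words is joined with the run of dots that follows it. This relies on the list starting with a word: a list that starts with '.' would shift the pairing, and consecutive dots would all end up in the same sublist.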

Answer 2

We can do this in two stages: first calculate the indices where the dots are located, then slice the list accordingly:

idxs = [i for i, v in enumerate(sentence, 1) if v == '.']   # calculating indices

result = [sentence[i:j] for i, j in zip([0]+idxs, idxs)]    # splitting accordingly

This then yields:

>>> [sentence[i:j] for i, j in zip([0]+idxs, idxs)]
[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

You can then, for example, print the individual sublists with:

for sub in [sentence[i:j] for i, j in zip([0]+idxs, idxs)]:
    print(sub)

This will print:

>>> idxs = [i for i, v in enumerate(sentence, 1) if v == '.']
>>> for sub in [sentence[i:j] for i, j in zip([0]+idxs, idxs)]:
...     print(sub)
...
['I', 'am', 'good', '.']
['I', 'like', 'you', '.']
['we', 'are', 'not', 'friends', '.']
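Note that with this approach any words after the last '.' are dropped, since no slice covers them. If the list may not end with '.', one possible adjustment (a sketch, not part of the original answer) is to add the list length as a final boundary:

```python
sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends"]

# Positions just past each '.', as in the answer above.
idxs = [i for i, v in enumerate(sentence, 1) if v == '.']

# If words remain after the last dot, add len(sentence) as a final boundary
# so the trailing words form their own sublist.
if sentence and (not idxs or idxs[-1] != len(sentence)):
    idxs.append(len(sentence))

result = [sentence[i:j] for i, j in zip([0] + idxs, idxs)]
print(result)
```

For the list above this gives three sublists, the last one without a trailing '.'.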

Answer 3

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

output = []
temp = []
for item in sentence:
    temp.append(item)
    if item == '.':
        output.append(temp)
        temp = []
if temp:
    output.append(temp)

print(output)
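The same loop can be wrapped in a generator if it needs to be reused. This is only a sketch: split_after and its sep parameter are hypothetical names, not something from the standard library.

```python
def split_after(items, sep="."):
    """Yield sublists of items, each ending with sep (the last one may not)."""
    chunk = []
    for item in items:
        chunk.append(item)
        if item == sep:
            # A separator closes the current sublist.
            yield chunk
            chunk = []
    if chunk:
        # Leftover items after the last separator.
        yield chunk


sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
for sub in split_after(sentence):
    print(sub)
```

Like the loop above, it keeps trailing items that are not followed by a separator.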

Answer 4

Using a simple iteration.

Demo:

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
last = len(sentence) - 1
result = [[]]
for i, v in enumerate(sentence):
    if v == ".":
        result[-1].append(".")
        if i != last:
            result.append([])
    else:
        result[-1].append(v)
print(result)

Output:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

Answer 5

This answer aims to be the simplest one...

The data

sentences = ["I", "am", "good", ".",
            "I", "like", "you", ".",
            "We", "are", "not", "friends", "."]

We initialize the output list and set a flag to indicate that we are starting a new sentence

l, start = [], 1

We loop over the data list, using w to refer to the current word

If we are at the start of a new sentence, we clear the flag and add an empty list to the tail of the output list.
We append the current word to the last sublist (note that ① we are guaranteed that there is at least a last sublist (do you like alliterations?) and ② every word gets appended).
If we are at the end of a sentence, that is, we have met a ".", we raise the flag again.

Note the single comment…

for w in sentences:
    if start: start = l.append([]) # l.append() returns None, that is falsey...
    l[-1].append(w)
    if w == ".": start = 1

Answer 6

You could do this by joining the elements together into a string and then splitting the string back again using a regex:

import re

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
result = [m.split('\0') for m in re.findall(r'(?<=\0).*?\.(?=\0|$)', '\0'.join(['.']+sentence))]

Output:

[
 ['I', 'am', 'good', '.'],
 ['I', 'like', 'you', '.'],
 ['we', 'are', 'not', 'friends', '.']
]
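The one-liner is dense, so here is a sketch that breaks it into its intermediate steps. The names joined and matches are only for illustration.

```python
import re

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

# Prepend a '.' so that every sentence, including the first one,
# is preceded by a NUL separator in the joined string.
joined = '\0'.join(['.'] + sentence)

# Each match starts right after a NUL and lazily extends to the first '.'
# that is followed by a NUL or by the end of the string.
matches = re.findall(r'(?<=\0).*?\.(?=\0|$)', joined)

# Show the matches with spaces instead of NULs for readability.
print([m.replace('\0', ' ') for m in matches])

# Splitting each match on NUL recovers the original words.
result = [m.split('\0') for m in matches]
```

One thing to keep in mind with this approach: a word that itself ends with '.' (for example "etc.") would also be treated as the end of a sentence, and the words must not contain '\0'.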

Python: How to split a list based on a specific element

Question

6 answers

Answer 1
16, accepted, 2018-10-01 12:24:01

Answer 2
4, 2018-10-01 12:23:46

Answer 3
2, 2018-10-01 12:30:28

Answer 4
1, 2018-10-01 12:20:25

Answer 5
0, 2018-10-01 12:55:35

Answer 6
0, 2022-06-13 00:57:05
