Sorting sentences in a list by their order in the text

Question

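I have extracted some sentences from a text in Python. The text is stored in a string and the sentences are stored in a list. Here is some sample input: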

text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
extracted = ['Here is sentence 2.', 'This is a text']

Now I want to order the elements of the extracted list according to the order in which they appear in the text. This is my desired output:

ordered_result = ['This is a text', 'Here is sentence 2.']

Does anyone know how to do this?
Thanks in advance.

Answer 1

Simply sort them by their position in the original string:

ordered_result = sorted(extracted, key=lambda x: text.index(x))
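
One caveat with this one-liner: text.index raises ValueError if a sentence does not occur in the text. Below is a minimal sketch of a more forgiving variant, assuming you would rather push unmatched items to the end than fail. The helper name order_by_position and the extra 'Not in the text.' item are hypothetical, added only for illustration.

```python
def order_by_position(text, extracted):
    """Sort sentences by first occurrence in text; unmatched ones go last."""
    def position(sentence):
        idx = text.find(sentence)  # str.find returns -1 instead of raising
        return idx if idx != -1 else len(text)
    return sorted(extracted, key=position)


text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
# 'Not in the text.' is a made-up item to show the fallback behaviour
extracted = ['Here is sentence 2.', 'Not in the text.', 'This is a text']

ordered_result = order_by_position(text, extracted)
print(ordered_result)
# ['This is a text', 'Here is sentence 2.', 'Not in the text.']
```

Because this searches for substrings, 'This is a text' without the trailing period still matches, which is what the question's sample input needs.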

Answer 2

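One way is to use a dictionary to build an index mapping, which takes O(n) time.

Then call sorted with a custom key that looks each sentence up in this dictionary.

This approach relies on having the full list of sentences to begin with. I have built one below in case you do not have it.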

text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."

extracted = ['Here is sentence 2.', 'This is a text.']

# create list of sentences
full_list = [i.strip()+'.' for i in filter(None, text.split('.'))]

# map sentences to integer location
d_map = {v: k for k, v in enumerate(full_list)}

# sort by calculated location mapping
extracted_sorted = sorted(extracted, key=d_map.get)

['This is a text.', 'Here is sentence 2.']
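
Note that d_map.get returns None for a sentence that is missing from the mapping (for example 'This is a text' without its trailing period, as in the question), and on Python 3 sorting a mix of None and integers raises TypeError. Here is a hedged sketch of one way around that, assuming unknown sentences should simply sort last. The name sort_by_map is hypothetical.

```python
def sort_by_map(text, extracted):
    # Rebuild the sentence list as above, restoring the trailing period.
    full_list = [s.strip() + '.' for s in text.split('.') if s.strip()]
    d_map = {sentence: i for i, sentence in enumerate(full_list)}
    # Unknown sentences get an index past the end, so they sort last.
    return sorted(extracted, key=lambda s: d_map.get(s, len(full_list)))


text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
result = sort_by_map(text, ['Here is sentence 2.', 'This is a text', 'This is a text.'])
print(result)
# ['This is a text.', 'Here is sentence 2.', 'This is a text']
```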

Answer 3

The preferred (but slightly more complex) approach is to use a regular expression search:

import re

expression = re.compile(r'([A-Z][^\.!?]*[\.!?])')
text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."

# Find all occurrences of `expression` in `text`
match = re.findall(expression, text)

print(match)
# ['This is a text.', 'This is sentence 1.', 'Here is sentence 2.', 'And this is sentence 3.']
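
The matches still have to be turned into a sort key to answer the question. Here is a minimal sketch, assuming every extracted sentence appears verbatim (punctuation included) among the matches. It uses re.finditer so that each sentence's character offset is available. The name sort_by_regex_order is hypothetical.

```python
import re

SENTENCE = re.compile(r'[A-Z][^.!?]*[.!?]')


def sort_by_regex_order(text, extracted):
    # Map each matched sentence to the offset of its first occurrence.
    offsets = {}
    for m in SENTENCE.finditer(text):
        offsets.setdefault(m.group(), m.start())
    # A KeyError here means a sentence was not matched by the pattern.
    return sorted(extracted, key=lambda s: offsets[s])


text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
ordered = sort_by_regex_order(text, ['Here is sentence 2.', 'This is a text.'])
print(ordered)
# ['This is a text.', 'Here is sentence 2.']
```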

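A cruder (but simpler) way to do this is to split the text on ". ", which gives you the list of sentences in the order they appear. The only downside is that you lose the punctuation.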

text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
splitt = text.split(". ")

print(splitt)
# splitt = ['This is a text', 'This is sentence 1', 'Here is sentence 2', 'And this is sentence 3.']
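
Because the split drops the trailing periods (except on the last item), a comparison against extracted needs the same normalization on both sides. Here is a rough sketch, assuming sentences end only in '.' and that every extracted sentence occurs in the text. The name sort_by_split_order is hypothetical.

```python
def sort_by_split_order(text, extracted):
    # Strip trailing periods so 'A.' and 'A' compare equal.
    sentences = [s.rstrip('.') for s in text.split('. ')]
    order = {s: i for i, s in enumerate(sentences)}
    return sorted(extracted, key=lambda s: order[s.rstrip('.')])


text = "This is a text. This is sentence 1. Here is sentence 2. And this is sentence 3."
ordered_result = sort_by_split_order(text, ['Here is sentence 2.', 'This is a text'])
print(ordered_result)
# ['This is a text', 'Here is sentence 2.']
```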

3 answers

Answer 1: 2, accepted, 2018-06-12 09:43:06

Answer 2: 1, 2018-06-12 09:42:55

Answer 3: 0, 2018-06-12 09:53:35
