Split a string according to the sentence phrases contained in a list in Python

I want to split a sentence so that any phrase that appears in a given list stays together as one item. For example:

sentence = "the world is too big"
list = ["too big", "too small", "the world", "too many"]

If the sentence contains a phrase from the list, the result I want is:

result = ["the world", "is", "too big"]

Instead of:

result = ["the", "world", "is", "too", "big"]

Thank you so much.

sentence = "the world is too big"
phrases = ["too big", "too small", "the world", "too many"]
placeholder = '..............'  # assumed not to occur in the sentence itself

# Protect the spaces inside each known phrase so that split() keeps it whole.
for phrase in phrases:
    if phrase in sentence:
        sentence = sentence.replace(phrase, phrase.replace(' ', placeholder))

# Split on the remaining spaces, then restore the protected ones.
result = [word.replace(placeholder, ' ') for word in sentence.split(' ')]
print(result)  # ['the world', 'is', 'too big']

Use re.split:

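>>> import re
>>> s = "the world is too big"
>>> phrases = ["too big", "too small", "the world", "too many"]
>>> pattern = fr"\s*({'|'.join(phrases)})\s*"
>>> pattern
'\\s*(too big|too small|the world|too many)\\s*'
>>> # This sentence starts and ends with a listed phrase, so re.split
>>> # returns an empty string at each end; [1:-1] drops them.
>>> re.split(pattern, s)[1:-1]
['the world', 'is', 'too big']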
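The re.split pattern above works for this sentence, but it has three limits. It assumes the sentence starts and ends with a listed phrase, it does not split the words that sit between phrases, and it does not escape regex metacharacters inside the phrases. Here is a minimal sketch of a more general version. The helper name split_by_phrases is hypothetical, and the sketch assumes phrases are delimited by whitespace or the ends of the string, so a phrase followed directly by punctuation is not matched.

```python
import re

def split_by_phrases(sentence, phrases):
    """Split sentence into words, keeping each listed phrase as one item."""
    # Longest first, so a longer phrase wins over a shorter one it contains.
    alternatives = sorted((p for p in phrases if p), key=len, reverse=True)
    if not alternatives:
        return sentence.split()
    # re.escape guards against regex metacharacters inside the phrases.
    # The lookarounds require whitespace (or a string end) around a phrase.
    body = "|".join(re.escape(p) for p in alternatives)
    pattern = rf"(?<!\S)({body})(?!\S)"
    result = []
    # With one capturing group, re.split alternates: text, phrase, text, ...
    for i, part in enumerate(re.split(pattern, sentence)):
        if i % 2 == 1:
            result.append(part)          # a captured phrase, kept whole
        else:
            result.extend(part.split())  # ordinary words between phrases
    return result

phrases = ["too big", "too small", "the world", "too many"]
print(split_by_phrases("the world is too big", phrases))
```

Because the text between phrases goes through str.split(), the empty strings at the ends disappear on their own, and sentences with no listed phrase fall back to a plain word split.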

