Python: How to remove sentences starting with specific word(s)
I am trying to remove whole sentences that start with a certain phrase, while retaining the rest of the body text. For example:
text = "Hello I like dogs. I also like cats. Hello I like animals"
I want to remove any sentence that starts with "Hello" but keep the rest, so the function should leave only:
"I also like cats."
Currently I am experimenting with regular expressions, but I am not sure how to achieve this. Any help would be appreciated.
Here is a basic approach. You may need something fancier to split the sentences; see this post for more details.
>>> text = "Hello I like dogs. I also like cats. Hello I like animals"
>>> sentences = text.split(". ")
>>> ". ".join(s for s in sentences if not s.lower().startswith("hello")) + "."
'I also like cats.'
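If the text also uses terminators such as "!" or "?", one possible refinement is to split with a regular expression instead of ". ". The following is only a minimal sketch, not a robust sentence tokenizer: the function name remove_sentences_starting_with is made up for illustration, and it assumes every sentence ends with ".", "!" or "?" followed by whitespace.

```python
import re

def remove_sentences_starting_with(text, prefix):
    """Drop every sentence that starts with `prefix` (case-insensitive).

    Hypothetical helper: assumes a sentence ends with '.', '!' or '?'
    followed by whitespace, so abbreviations like 'e.g. ' will be
    split incorrectly.
    """
    # Split on the whitespace after a terminator; the lookbehind keeps
    # the terminator attached to its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = [s for s in sentences if not s.lower().startswith(prefix.lower())]
    return " ".join(kept)

text = "Hello I like dogs. I also like cats. Hello I like animals"
print(remove_sentences_starting_with(text, "Hello"))
```

Because each sentence keeps its own punctuation, there is no need to append a trailing "." after joining.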
Please read the comments in the code:
text = "Hello I like dogs. I also like cats. Hello I like animals"
# Get the list of sentences by splitting on a dot followed by a space (". "),
# e.g. ['Hello I like dogs', 'I also like cats', 'Hello I like animals']
t = text.split('. ')
# Loop over a copy of the list (t[:]): removing items from a list while
# iterating over it directly can skip elements
for v in t[:]:
    # Check whether the first 5 characters are "Hello"
    if v[0:5] == 'Hello':
        # If so, remove the value from the list
        t.remove(v)
# t now holds the filtered sentences: ['I also like cats']
Note that the word 'Hello' may appear in upper or lower case, so to cover every variant, change the if condition to:
if v[0:5].casefold() == 'hello':
casefold() normalizes the string to lowercase, so the comparison ignores case.
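Keep in mind that comparing the first five characters also matches longer words that merely begin with those letters. If that matters, one option is a regular expression with a word boundary. This is a rough sketch; the pattern name starts_with_hello and the sample sentences are illustrative assumptions, not part of the answer above.

```python
import re

# Hypothetical pattern: "hello" as a whole word, in any letter case.
# Pattern.match() only matches at the start of the string.
starts_with_hello = re.compile(r"hello\b", re.IGNORECASE)

sentences = [
    "Hello I like dogs",
    "I also like cats",
    "HELLO again",
    "Helloween is fun",  # begins with the same letters but is a different word
]

# Build a new list rather than removing items in place
filtered = [s for s in sentences if not starts_with_hello.match(s)]
print(filtered)
```

Here "Helloween is fun" is kept, because the word boundary \b requires "hello" to end where the word ends.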