
Using regex to remove the rest of a sentence after a key phrase in Python

I am looking for a regex solution that removes every word in the rest of a sentence after the occurrence of a key phrase.

Example

sentence = "The weather forecast for today is mostly sunny. The forecast for tomorrow will be rainy. The rest of the week..."

key_phrase = "for tomorrow"

Desired output = "The weather forecast for today is mostly sunny. The forecast. The rest of the week..."

Attempt

head, sep, tail = sentence.partition(key_phrase)
print(head)

My idea is to first split the string into sentences, apply the above technique to each one, and then join the results. However, I feel like there must be a more elegant way to do this with regex?
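For comparison, the split, partition, and join idea described above could be sketched roughly as follows. The function name `strip_after_phrase` is hypothetical, and splitting on a bare "." is an assumption that only holds for simple text like the example (abbreviations or decimal numbers would need a real sentence tokenizer).

```python
def strip_after_phrase(text: str, key_phrase: str) -> str:
    """Drop everything from key_phrase to the end of each sentence."""
    cleaned = []
    # Naive sentence split: every "." is treated as a sentence boundary.
    for part in text.split("."):
        head, sep, _tail = part.partition(key_phrase)
        # sep is non-empty only when the phrase was found; in that case
        # also trim the whitespace that preceded the phrase.
        cleaned.append(head.rstrip() if sep else part)
    # Re-insert the periods that split() removed.
    return ".".join(cleaned)


sentence = ("The weather forecast for today is mostly sunny. "
            "The forecast for tomorrow will be rainy. The rest of the week...")
print(strip_after_phrase(sentence, "for tomorrow"))
```

This works on the sample input, but it is more code than a single substitution, which is the motivation for asking about a regex.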

Thanks for the help

Using re.sub

Ex:

import re

sentence = "The weather forecast for today is mostly sunny. The forecast for tomorrow will be rainy. The rest of the week..."
key_phrase = "for tomorrow"
# Lazily match from the key phrase up to (but not including) the next period.
print(re.sub(fr"({key_phrase}.*?)(?=\.)", "", sentence))

Output

The weather forecast for today is mostly sunny. The forecast . The rest of the week...

Use

re.sub(fr"{re.escape(key_phrase)}[^.]*", "", sentence)

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  for                      'for'
--------------------------------------------------------------------------------
  \                        ' '
--------------------------------------------------------------------------------
  tomorrow                 'tomorrow'
--------------------------------------------------------------------------------
  [^.]*                    any character except: '.' (0 or more times
                           (matching the most amount possible))
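The `\ ` row in the table comes from `re.escape`, which backslash-escapes characters that could be special in a pattern. A small sketch of why that matters, using a made-up sample text and key phrase (both are assumptions for illustration, not from the question):

```python
import re

# Hypothetical input: the key phrase contains the metacharacters "$", "(" and ")".
text = "Total is $5 (approx) today. Next line."
key_phrase = "$5 (approx)"

# Without escaping, "$" is an end-of-string anchor, so the pattern never matches.
unescaped = re.sub(key_phrase + r"[^.]*", "", text)

# With re.escape the phrase is matched literally, up to the next period.
escaped = re.sub(re.escape(key_phrase) + r"[^.]*", "", text)

print(unescaped)
print(escaped)
```

With a plain phrase such as "for tomorrow" both versions behave the same, so escaping mainly protects against key phrases supplied by users or read from data.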

See Python proof:

import re
sentence = "The weather forecast for today is mostly sunny. The forecast for tomorrow will be rainy. The rest of the week..."
key_phrase = "for tomorrow"
print(re.sub(fr"{re.escape(key_phrase)}[^.]*", "", sentence))

Results: The weather forecast for today is mostly sunny. The forecast . The rest of the week...

Note that a space remains before the period, because the match starts at the key phrase itself and not at the whitespace preceding it.
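If the whitespace in front of the key phrase should be removed as well, so that the result matches the desired output in the question exactly, one possible variant is to put `\s*` before the escaped phrase. This is a minimal sketch; the helper name `remove_after_phrase` is hypothetical, and it still assumes that a period always ends a sentence.

```python
import re


def remove_after_phrase(text: str, key_phrase: str) -> str:
    """Remove key_phrase, any whitespace before it, and the rest of its sentence."""
    # \s* consumes whitespace before the phrase; [^.]* runs up to the next period.
    pattern = re.compile(rf"\s*{re.escape(key_phrase)}[^.]*")
    return pattern.sub("", text)


sentence = ("The weather forecast for today is mostly sunny. "
            "The forecast for tomorrow will be rainy. The rest of the week...")
print(remove_after_phrase(sentence, "for tomorrow"))
# The weather forecast for today is mostly sunny. The forecast. The rest of the week...
```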
