简体   繁体   English

正则表达式替换python中的多个标点符号

[英]Regex replace multiple punctuation in python

I would like to find multiple occurrences of exclamation marks, question marks and periods (such as !!?! , ...? , ...! ) and replace them with just the final punctuation. 我想找到多次出现惊叹号,问号和句号(例如!!?!...?...! )并用最后的标点符号替换它们。

ie !?!?!? !?!?!? would become ? 会成为?

and ....! ....! would become ! 会成为!

Is this possible? 这可能吗?

text = re.sub(r'[\?\.\!]+(?=[\?\.\!])', '', text)

That is, remove any sequence of ?!. 也就是说,删除任何序列的?!. characters that are going to be followed by another ?!. 将被另一个人跟随的角色?!. character. 字符。

[...] is a character class. [...]是一个角色类。 It matches any character inside the brackets. 它匹配括号内的任何字符。

+ means "1 or more of these". +表示“这些中的一个或多个”。

(?=...) is a lookahead. (?=...)是一个先行者。 It looks to see what is going to come next in the string. 它看起来会看到字符串中接下来会发生什么。

text = re.search('[.?!]*([.?!])', text).group(1)

这种方式的工作方式是括号创建一个捕获组 ,允许您通过group函数访问匹配的文本。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM