Regex to match certain characters but not include the preceding period

Question

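I have a string that contains some line breaks. I want to replace them with periods, but without adding a second period where a sentence already ends with one.

For example: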

text = "This is the oldest European-settled town in the continental " \
   "U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a " \
   "scenic cruise aboard \r\n"

I am trying to change it to the following using a regular expression.

text = "This is the oldest European-settled town in the continental " \
   "U.S. Explore the town at your leisure. Upgrade to add" \
   " a scenic cruise aboard."

What I have now is:

new_text = re.sub("(( )?(\\n|\\r\\n)+)", ". ", text).strip()

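However, it does not handle sentences that already end with a period. Should I use some kind of lookaround here, and if so, how?

Thanks in advance!!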

Answer 1

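You can add an optional "." to the regex: "(( )?\\.?(\\n|\\r\\n)+)". If a "." is present before the line break, it is matched as well and replaced together with the line break by ". ", so the period is not doubled.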
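As a quick check of this suggestion, here is a minimal sketch that applies the pattern to the text from the question. The second input (`"One. \r\nTwo\r\n"`) is a made-up edge case, not from the question; it suggests the pattern still doubles the period when a space sits between the period and the line break.

```python
import re

text = ("This is the oldest European-settled town in the continental "
        "U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a "
        "scenic cruise aboard \r\n")

# The question's pattern with an optional period added before the line break.
pattern = "(( )?\\.?(\\n|\\r\\n)+)"
new_text = re.sub(pattern, ". ", text).strip()
print(new_text)

# Hypothetical edge case: a period followed by a space before the break.
# The pattern expects the space before the period, so here only the space
# is consumed and the existing period stays, giving a doubled period.
edge = re.sub(pattern, ". ", "One. \r\nTwo\r\n").strip()
print(edge)
```

A character class such as `[ .]*`, as used in Answer 2, consumes periods and spaces in any order and avoids that edge case.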

Answer 2

Well, I'm not sure whether you mean the \r\n to be literal or not, so...

Literal:

>>> import re
>>> text = r"This is the oldest European-settled town in the continental U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a scenic cruise aboard \r\n"
>>> result = re.sub(r'[ .]*(?:(?:\\r)?\\n)+', '. ', text).strip()
>>> print(result)
This is the oldest European-settled town in the continental U.S. Explore the town at your leisure. Upgrade to add a scenic cruise aboard.

ideone demo.

Not literal:

>>> import re
>>> text = "This is the oldest European-settled town in the continental U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a scenic cruise aboard \r\n"
>>> result = re.sub(r'[ .]*(?:\r?\n)+', '. ', text).strip()
>>> print(result)
This is the oldest European-settled town in the continental U.S. Explore the town at your leisure. Upgrade to add a scenic cruise aboard.

ideone demo.
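The difference between the two cases can be checked directly. This is a small sketch with made-up sample strings (`literal` and `actual` are illustrative names): a raw string keeps the backslashes as characters, while a normal string turns `\r\n` into a carriage return and a line feed, so each case needs its own pattern.

```python
import re

literal = r"a\r\nb"  # raw string: a, backslash, r, backslash, n, b
actual = "a\r\nb"    # normal string: a, carriage return, line feed, b

print(len(literal), len(actual))

# Pattern for backslash sequences that are literally present in the text.
pattern_literal = re.compile(r'(?:\\r)?\\n')
# Pattern for real carriage-return / line-feed characters.
pattern_actual = re.compile(r'\r?\n')

# Each pattern only matches the kind of text it was written for.
print(bool(pattern_literal.search(literal)), bool(pattern_literal.search(actual)))
print(bool(pattern_actual.search(literal)), bool(pattern_actual.search(actual)))
```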

I removed some unnecessary groups and converted the remaining ones to non-capturing groups.

I also changed (\\n|\\r\\n)+ into the slightly more efficient form (?:(?:\\r)?\\n)+
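Since the question asks about lookarounds, here is one possible sketch using a lookbehind. It is an alternative, not what any of the answers propose, and the helper name `join_lines` is made up for illustration. It works in two passes: line breaks that follow a period become a single space, and the remaining line breaks become ". ".

```python
import re

def join_lines(text):
    """Join lines into sentences, adding a period only where one is missing."""
    # Pass 1: a break right after a period (optionally with spaces in
    # between) only needs a space, since the period is already there.
    text = re.sub(r'(?<=\.) *(?:\r?\n)+', ' ', text)
    # Pass 2: every remaining break, with any spaces before it, gets ". ".
    text = re.sub(r' *(?:\r?\n)+', '. ', text)
    return text.strip()

text = ("This is the oldest European-settled town in the continental "
        "U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a "
        "scenic cruise aboard \r\n")
print(join_lines(text))
```

The single-pass character-class approach is shorter. The lookbehind version mainly makes the "already ends with a period" condition explicit.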

Answer 3

If you just want to get rid of the line breaks, use this:

text = "This is the oldest European-settled town in the continental U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a scenic cruise aboard \r\n"
text = text.replace('\r\n','')
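Note that replacing with an empty string joins the lines without any separator, so the result differs from the output requested in the question. If a non-regex approach is preferred, one possible sketch (an assumption about what is wanted, not part of this answer) splits on line breaks and adds a period only where one is missing:

```python
text = ("This is the oldest European-settled town in the continental "
        "U.S.\r\nExplore the town at your leisure\r\nUpgrade to add a "
        "scenic cruise aboard \r\n")

# Plain replace: the lines run together with no space or period between them.
joined = text.replace('\r\n', '')
print(joined)

# splitlines() handles both "\n" and "\r\n"; empty lines are skipped.
parts = [line.strip() for line in text.splitlines() if line.strip()]
# Append a period only to parts that do not already end with one.
result = " ".join(p if p.endswith(".") else p + "." for p in parts)
print(result)
```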

Answer 1
2 2014-01-29 16:50:15

Answer 2
1 Accepted 2014-01-29 17:01:08

Answer 3
0 2014-01-29 16:45:17
