
Python RegEx remove new lines (that shouldn't be there)

I have some extracted text that I would like to clean up with a RegEx.

I have learned basic RegEx, but I am not sure how to build this one:

str = '''
this is 
a line that has been cut.
This is a line that should start on a new line
'''

should be converted to:

str = '''
this is a line that has been cut.
This is a line that should start on a new line
'''

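The pattern r'\w\n\w' seems to catch it, but I am not sure how to replace the newline with a space without touching the end and the start of the surrounding words.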
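A minimal sketch of one way to leave the neighbouring characters alone: lookarounds check for a word character on each side without consuming it, so only the line break (and any spaces or tabs just before it) gets replaced by a single space. The function name `rejoin_wrapped_words` and the sample strings are made up for illustration, and the rule "a break between two words is an unwanted break" is an assumption taken from the `\w\n\w` idea; it may not hold for every text.

```python
import re

def rejoin_wrapped_words(text):
    # (?<=\w)  : a word character must come right before (not consumed)
    # [ \t]*   : optional trailing spaces/tabs left over from the cut line
    # \n       : the line break itself
    # (?=\w)   : a word character must come right after (not consumed)
    return re.sub(r'(?<=\w)[ \t]*\n(?=\w)', ' ', text)

sample = ("this is \n"
          "a line that has been cut.\n"
          "This is a line that should start on a new line\n")

print(rejoin_wrapped_words(sample))
```

Because "cut." ends in a period rather than a word character, the break after it is kept, and the final newline is kept because nothing follows it.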

You can use this lookbehind-based regex with re.sub:

>>> import re
>>> str = '''
... this is 
... a line that has been cut.
... This is a line that should start on a new line
... '''
>>> print(re.sub(r'(?<!\.)\n', '', str))
this is a line that has been cut.
This is a line that should start on a new line
>>>

RegEx Demo

(?<!\.)\n matches every newline that is not immediately preceded by a period.
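As a rough illustration of what the negative lookbehind does, here is a minimal Python 3 sketch. The helper name `join_broken_lines` and the `sample` string are made up for this example; the pattern is the one used with re.sub above.

```python
import re

def join_broken_lines(text):
    # Remove each newline unless the character right before it is a period.
    return re.sub(r'(?<!\.)\n', '', text)

# Note the trailing space after "this is": the newline is removed outright,
# so that space is what keeps "is" and "a" apart.
sample = ("this is \n"
          "a line that has been cut.\n"
          "This is a line that should start on a new line\n")

print(join_broken_lines(sample))
```

Keep in mind that every line not ending in a period, including one ending in ? or !, is joined to the following line, and the final newline of the text is removed as well.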

If you don't want the match to depend on the presence of a period, use:

re.sub(r'(?<=\w\s)\n', '', str)
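To see how this variant behaves, here is a small Python 3 sketch; `sample` and `no_trailing_space` are made-up inputs. The lookbehind requires a word character followed by a whitespace character right before the newline, so it only fires when the cut line ends in trailing whitespace.

```python
import re

pattern = r'(?<=\w\s)\n'

# The cut line ends in "is " (word character + space), so its newline matches.
sample = ("this is \n"
          "a line that has been cut.\n"
          "This is a line that should start on a new line\n")
fixed = re.sub(pattern, '', sample)
print(repr(fixed))

# Without the trailing space the newline is preceded by "is", so nothing matches.
no_trailing_space = "this is\na line that has been cut.\n"
unchanged = re.sub(pattern, '', no_trailing_space)
print(unchanged == no_trailing_space)
```

If the extracted text does not reliably keep that trailing space, this pattern will leave the broken lines as they are.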

RegEx Demo 2

