Replace spaces with new lines if part of a specific pattern using sed and regex with extended syntax

So I have a text file with multiple instances looking like this:

word.  word or words [something:'else]

I need to replace the double space with a new line wherever it comes after a period and is followed by a sequence of words and then a "[", like so:

word.\nword or words [something:'else]

I thought about using the sed command in bash with extended regex syntax, but nothing has worked so far. I've tried different variations of this:

sed -E 's/(\.)(  )(.*)(.\[)/\1\n\3\4/g' old.txt > new.txt

I'm an absolute beginner at this, so I'm not at all sure about what I'm doing.

This might work for you (GNU sed):

sed -E 's/\.  ((\w+ )+\[)/\.\n\1/g' file

This globally replaces a period followed by two spaces, one or more space-separated words, and an opening square bracket with a period, a newline, and the text captured by the first group (the words and the bracket).
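To see what that does, here is a small sketch run on the sample line from the question. The variable names and the second, non-matching line are made up for illustration, and it assumes GNU sed, since `\w` and `\n` in the replacement are GNU extensions:

```shell
# Hypothetical sample input: a period, two spaces, words, then "[".
input="word.  word or words [something:'else]"

# GNU sed assumed. The outer group captures the words plus the "[",
# so \1 puts them back after the inserted newline.
result=$(printf '%s\n' "$input" | sed -E 's/\.  ((\w+ )+\[)/.\n\1/g')
printf '%s\n' "$result"

# A double space that is not followed by "words [" is left alone.
plain="One.  Two three."
untouched=$(printf '%s\n' "$plain" | sed -E 's/\.  ((\w+ )+\[)/.\n\1/g')
printf '%s\n' "$untouched"
```

The first line should come out split in two after the period, and the second should come out unchanged.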

Your sed command is almost correct (but contains some redundancies):

sed -E 's/(\.)(  )(.*)(.\[)/\1\n\3\4/' old.txt > new.txt
#                                   ^
#                                   You forgot to terminate the s command with this final /

But you don't need to capture everything. A simpler one could be:

sed -E 's/\.  (.*\[)/.\n\1/' old.txt > new.txt
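One thing to watch, in case several instances can end up on the same line: `.*` is greedy, so it runs to the last "[" on the line and only the first double space gets split. A rough comparison on a made-up line (the input and variable names are hypothetical, GNU sed assumed):

```shell
# Hypothetical line containing two instances of the pattern.
line="one.  two three [a:'b] four.  five six [c:'d]"

# Greedy .* swallows everything up to the last "[", so only the
# first double space is turned into a newline.
greedy=$(printf '%s\n' "$line" | sed -E 's/\.  (.*\[)/.\n\1/')

# Limiting the middle part to space-separated words and adding the
# g flag lets sed split at both places.
words=$(printf '%s\n' "$line" | sed -E 's/\.  ((\w+ )+\[)/.\n\1/g')

printf 'greedy:\n%s\n\nwords:\n%s\n' "$greedy" "$words"
```

If every instance is already on its own line, the simpler greedy version is enough.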
