Replace spaces with new lines if part of a specific pattern using sed and regex with extended syntax
I have a text file with multiple instances looking like this:
word.  word or words [something:'else]
I need to replace the double space with a newline wherever it follows a period and is itself followed by a sequence of words and then a "[", like so:
word.\nword or words [something:'else]
I thought about using the sed command in bash with extended regex syntax, but nothing has worked so far. I've tried different variations of this:
sed -E 's/(\.)(  )(.*)(.\[)/\1\n\3\4/g' old.txt > new.txt
I'm an absolute beginner at this, so I'm not at all sure about what I'm doing.
This might work for you (GNU sed):
sed -E 's/\.  ((\w+ )+\[)/\.\n\1/g' file
Globally replace a period followed by two spaces, one or more space-separated words, and an opening square bracket with a period, a newline, and the back reference that matched the words and the bracket.
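To see the substitution in action, here is a minimal sketch that pipes one sample line through it. It assumes GNU sed (needed for `\w` and for `\n` in the replacement); the function name `split_after_period` is made up for illustration, and the two spaces are written as ` {2}` only so they stay visible.

```shell
#!/bin/sh
# Hypothetical helper wrapping the substitution described above; assumes GNU sed.
split_after_period() {
    # period + two spaces + one or more "word " groups + "["
    # becomes: period, newline, then the captured words and bracket
    sed -E 's/\. {2}((\w+ )+\[)/.\n\1/g'
}

# Sample line taken from the question
printf '%s\n' "word.  word or words [something:'else]" | split_after_period
# word.
# word or words [something:'else]
```

A period followed by a single space, or by words that never reach a "[", is left untouched.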
Your sed command is almost correct (but contains some redundancies):
sed -E 's/(\.)(  )(.*)(.\[)/\1\n\3\4/' old.txt > new.txt
#                                   ^
# You forgot to terminate the s command with this final /
But you don't need to capture everything. A simpler one could be:
sed -E 's/\.  (.*\[)/.\n\1/' old.txt > new.txt
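One thing to keep in mind with the simpler pattern: `.*` is greedy, so on a line with more than one period-plus-double-space it breaks at the first one and swallows the rest up to the last "[". The sketch below compares it with the word-based pattern from the other answer. It assumes GNU sed; the function names and the sample line are made up for illustration, and ` {2}` stands for the two spaces.

```shell
#!/bin/sh
# Hypothetical helpers for comparison; both assume GNU sed.

# Simpler pattern: greedy .* runs to the last "[" on the line.
greedy_split() {
    sed -E 's/\. {2}(.*\[)/.\n\1/'
}

# Word-based pattern: only breaks where plain words lead up to "[".
word_split() {
    sed -E 's/\. {2}((\w+ )+\[)/.\n\1/g'
}

sample='one.  two.  three four [x]'   # made-up input

printf '%s\n' "$sample" | greedy_split
# one.
# two.  three four [x]

printf '%s\n' "$sample" | word_split
# one.  two.
# three four [x]
```

If every line in the file has exactly one such period, both give the same result.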