Use sed to replace patterns that are not at the start or end of lines
Let's say I have this input:
/a/b/c/d/e/
/a/b/c/d/e
a/b/c/d/e/
a/b/c/d/e
I'd like to replace every / that is not at either edge of the line with +, so that the output is:
/a+b+c+d+e/
/a+b+c+d+e
a+b+c+d+e/
a+b+c+d+e
I've tried this command:
sed -e "s#\(.\)/\(.\)#\1+\2#g"
which is close but not quite:
/a+b/c+d/e/
/a+b/c+d/e
a+b/c+d/e/
a+b/c+d/e
presumably because the \(.\) groups overlap between successive / characters.
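The overlap can be seen on a shorter input. This is only a small sketch using the command from the question: each match consumes the character after the slash, so that character is no longer available as the "character before" the next slash during the same pass.

```shell
# One pass of the attempted substitution on a three-component path.
# The first match is "a/b"; scanning resumes at "/c", where no character
# is left in front of the slash, so the second slash is not replaced.
printf 'a/b/c\n' | sed 's#\(.\)/\(.\)#\1+\2#g'
# prints: a+b/c
```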
I don't believe sed has a null match operator for beginning or end of line. So, how is this done?
You can translate all slashes to + and then replace a + at the beginning or at the end of the line with a slash:
sed 'y/\//+/;s/^+\|+$/\//g;'
or, if the OR operator isn't available:
sed 'y/\//+/;s/^+/\//;s/+$/\//;'
It is better to change the delimiter so that you don't have to escape the literal slashes:
sed 'y~/~+~;s~^+\|+$~/~g;'
or, if the OR operator isn't available:
sed 'y~/~+~;s~^+~/~;s~+$~/~;'
(where ^ is an anchor for the start of the line and $ for the end)
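As a rough sketch, the variant without the OR operator can be wrapped in a small shell function and run against the four sample lines from the question. The function name join_inner_slashes is made up for this illustration and is not part of the answer.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the answer).
join_inner_slashes() {
    # y~/~+~   turn every / into +
    # s~^+~/~  put a leading slash back
    # s~+$~/~  put a trailing slash back
    sed 'y~/~+~;s~^+~/~;s~+$~/~'
}

printf '%s\n' /a/b/c/d/e/ /a/b/c/d/e a/b/c/d/e/ a/b/c/d/e | join_inner_slashes
# prints:
# /a+b+c+d+e/
# /a+b+c+d+e
# a+b+c+d+e/
# a+b+c+d+e
```

One thing to keep in mind with this approach: a line that already starts or ends with a literal + gets that + turned into a slash as well (for example, +a/b becomes /a+b), so it fits best when + does not occur at the edges of the input.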
Another way: you can protect the slashes you want to preserve using a placeholder:
sed 's~^/~{`%{~;s~/$~{`%{~;y~/~+~;s~{`%{~/~g;'
If you have perl, you can use lookarounds for this:
perl -pe 's~(?<!^)/(?!$)~+~g' file
Output:
/a+b+c+d+e/
/a+b+c+d+e
a+b+c+d+e/
a+b+c+d+e
Otherwise you can use this sed with 2 substitutions:
sed -r 's~(.)/(.)~\1+\2~g; s~(.)/(.)~\1+\2~g' file
Or this sed with a label and a loop:
sed -r ':a;s|(.)/(.)|\1+\2|g;ta' file
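The two commands agree on the sample input, but they can differ once slashes are adjacent. The sketch below compares them on a made-up line, a///b. The function names are hypothetical, and -E with separate -e expressions is used here on the assumption that your sed accepts them (GNU and BSD sed both do).

```shell
#!/bin/sh
# Hypothetical wrappers around the two commands above.
two_pass() { sed -E 's~(.)/(.)~\1+\2~g; s~(.)/(.)~\1+\2~g'; }
# :a defines a label; ta jumps back to it whenever the s command replaced something.
looped()   { sed -E -e ':a' -e 's|(.)/(.)|\1+\2|g' -e 'ta'; }

printf 'a///b\n' | two_pass   # prints: a++/b  (a third pass would be needed)
printf 'a///b\n' | looped     # prints: a+++b  (repeats until nothing matches)
```

The loop stops on its own because every successful substitution removes one slash, so eventually no match is left and ta falls through.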
Here is a sed command that gives your output:
sed -r 's=(.)/\b=\1+=g;' file
/ is normally used as the separator for the s command, but here we use = instead.
/ is matched where there is something (.) before it and we are at a word boundary right after it.
You tried (.)/(.) but that did not work: in x/y/z, the second match would only see /z and not y/z.
With \b, the first match does not consume the y, and the second match sees y/.
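A short sketch of what the word boundary does and does not cover, assuming GNU sed (\b is a GNU extension). The second input line is invented for this illustration: because \b needs a word character (letter, digit or underscore) right after the slash, a slash followed by punctuation is left alone.

```shell
# Assumes GNU sed; \b is not available in every sed implementation.
printf 'x/y/z\n'  | sed -E 's=(.)/\b=\1+=g'
# prints: x+y+z
printf 'a/.b/c\n' | sed -E 's=(.)/\b=\1+=g'
# prints: a/.b+c   (no word boundary between "/" and ".")
```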
This is the common and extremely useful sed idiom for doing jobs like this:
$ sed 's:a:aA:g; s:^/\|/$:aB:g; s:/:+:g; s:aB:/:g; s:aA:a:g' file
/a+b+c+d+e/
/a+b+c+d+e
a+b+c+d+e/
a+b+c+d+e
The 1st sub changes every a to aA. At that point there is no letter a in the input that is not followed by the letter A (we need to do this first to ensure that, after our 2nd sub, the only aBs in the input are the result of that 2nd sub).
The 2nd sub changes every / at the start or end of a line to aB. At that point the only aBs in the input are where there was originally a / at the start or end of the line.
The 3rd sub changes all remaining /s (i.e. those that were not at the start or end of the line) to +s.
The 4th sub restores the aBs back to the original leading and trailing /s.
The 5th sub restores the aAs back to the original as.
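The five steps can be traced one at a time on a single line. This is only an illustrative sketch: the variable names are made up, and step 2 is written with -E and a plain | instead of \| on the assumption that your sed supports extended regular expressions.

```shell
#!/bin/sh
line='/a/b/c/d/e/'

step1=$(printf '%s\n' "$line"  | sed 's:a:aA:g')         # /aA/b/c/d/e/
step2=$(printf '%s\n' "$step1" | sed -E 's:^/|/$:aB:g')  # aBaA/b/c/d/eaB
step3=$(printf '%s\n' "$step2" | sed 's:/:+:g')          # aBaA+b+c+d+eaB
step4=$(printf '%s\n' "$step3" | sed 's:aB:/:g')         # /aA+b+c+d+e/
step5=$(printf '%s\n' "$step4" | sed 's:aA:a:g')         # /a+b+c+d+e/

# Show every intermediate form, then the final result.
printf '%s\n' "$step1" "$step2" "$step3" "$step4" "$step5"
```

The reason for step 1 shows up with an input that already contains aB, such as x/aB/y: it first becomes x/aAB/y, so step 4 cannot mistake the original text for a placeholder.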
This might work for you (GNU sed):
sed ':a;s/\([^\/]\)\/\([^\/]\)/\1+\2/g;ta' file
Or, easier on the eyes:
sed -r ':a;s#([^/])/([^/])#\1+\2#g;ta' file
It is really just the same regexp applied twice:
sed 's/\([^\/]\)\/\([^\/]\)/\1+\2/g;s/\([^\/]\)\/\([^\/]\)/\1+\2/g' file