Odd catch/match sed behavior

Why does GNU sed not capture as expected in [1] below, while [2] and, above all, [3] are OK?

[1] $ echo "ca t" | sed -E 's/c([^ ]+) /\1/'  # Odd, expected 'a' or possibly 'a t'
    at
[2] $ echo "ca t" | sed -E 's/c([^ ]+)/\1/'   # OK, expected 'a t' 
    a t
[3] $ echo "ca t" | sed -E 's/ c([^ ]+)/\1/'  # OK, does not catch, but then why on earth is [1]?
    ca t

Try as I might to browse the GNU info on regexps for sed, I still cannot make sense of the output in [1].
I expected the blank space at the end of the regexp in [1] to stop the capture at the space between 'a' and 't', yielding 'a' as the match.
Supposing this trailing blank space does not count, for some reason unclear to me, I expected the match to behave as in [2], yielding 'a t'.
But then [3] shows that blank spaces do act as context for the match, since a leading blank space added to the regexp blocks any match.
sed version is 4.7 under Ubuntu 20.04.
I sure must be missing something somewhere. Any idea?

In [1], the regex 'c([^ ]+) ' matches 'ca ' (trailing space included), capturing 'a' as \1. The whole match, space and all, is replaced with 'a', so the pattern space becomes 'at'.
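
One way to check this is to make the match visible. The sketch below is only an illustration and not part of the original answer: it uses & (the whole matched text) in the replacement, with square brackets added purely as markers, and the variable names are made up for the example.

```shell
# Wrap the whole match (&) in brackets to see what the regex consumed.
whole_match=$(echo "ca t" | sed -E 's/c([^ ]+) /[&]/')
echo "$whole_match"    # [ca ]t  -> the trailing space is part of the match

# Wrap only the captured group (\1) in brackets.
group_only=$(echo "ca t" | sed -E 's/c([^ ]+) /[\1]/')
echo "$group_only"     # [a]t    -> the group itself is just 'a'

# To keep the space in the output, put it back in the replacement.
space_kept=$(echo "ca t" | sed -E 's/c([^ ]+) /\1 /')
echo "$space_kept"     # a t
```

The capture in [1] does stop at the space, as expected. The surprise comes from the replacement, which overwrites everything the regex matched and not just the captured group.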

In [3], the regex ' c([^ ]+)' requires a space before 'c'. The input has none, so the regex does not match and the pattern space is printed unmodified.
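
To tell "no match" apart from "matched but unchanged", a small sketch along these lines may help. It assumes the -n option together with the p flag of the s command, so a line is printed only when the substitution succeeds. The second input, with a leading space, is a hypothetical variant that does not appear in the question.

```shell
# -n suppresses automatic printing; the p flag prints only if s/// matched.
no_match=$(echo "ca t" | sed -n -E 's/ c([^ ]+)/\1/p')
echo "no match: '$no_match'"     # no match: ''

# Hypothetical input with a leading space, so ' c' is actually present.
with_space=$(echo " ca t" | sed -n -E 's/ c([^ ]+)/\1/p')
echo "match: '$with_space'"      # match: 'a t'
```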

I hope my explanation is clear enough.
