简体   繁体   English

在sed中执行由反向引用定义的命令

[英]Execute command defined by backreference in sed

I am creating a primitive experimental templating engine completely based on sed (merely for my private enjoyment). 我正在创建一个完全基于sed的原始实验模板引擎(仅出于个人目的)。 One thing I have been trying to achieve for several hours now is to replace certain text patterns with the output of a command they contain. 我已经尝试了好几个小时了,现在要做的一件事是用它们包含的命令输出替换某些文本模式。

To clearify, if an input line looks like this 为了澄清,如果输入行看起来像这样

Lorem {{echo ipsum}}

I would look the sed output to look like this: 我将看到sed输出看起来像这样:

Lorem ipsum

The closest I have come is this: 我最接近的是:

echo 'Lorem {{echo ipsum}}' | sed 's/{{\(.*\)}}/'"$(\\1)"'/g'

which does not work. 这不起作用。

However, 然而,

echo 'Lorem {{echo ipsum}}' | sed 's/{{\(.*\)}}/'"$(echo \\1)"'/g'

gives me 给我

Lorem echo ipsum

I don't quite understand what is happening here. 我不太了解这里发生了什么。 Why can I give the backreference to the echo command, but cannot evaluate the entire backreference in $()? 为什么我可以将后向引用赋予echo命令,但不能在$()中评估整个后向引用? When is \\\\1 getting evaluated? \\\\ 1什么时候得到评估? Is the thing I am trying to achieve even possible with pure sed? 我尝试使用纯sed甚至可以实现的目标吗?

Keep in mind that it is entirely clear to me that what I am trying to achieve is easily possible with other tools. 请记住,对我来说,很清楚,使用其他工具可以轻松实现我要实现的目标。 However, I am highly interested in whether this is possible with pure sed. 但是,我对纯sed是否可以实现这一点非常感兴趣。

Thanks! 谢谢!

The reason your attempt doesn't work is that $() is expanded by the shell before sed is even called. 您的尝试不起作用的原因是$()在调用sed之前已被外壳扩展。 For this reason it can't use the backreferences sed is eventually going to capture. 因此,它不能使用sed最终将要捕获的反向引用。

It is possible to do this sort of thing with GNU sed (not with POSIX sed). 使用GNU sed(而不是POSIX sed)可以做这种事情。 The main trick is that GNU sed has a e flag to the s command that makes it replace the pattern space (the whole space) with the result of the pattern space executed as a shell command. 主要技巧是GNU sed在s命令中带有一个e标志,使它用作为shell命令执行的模式空间的结果替换模式空间(整个空间)。 What this means is that 这意味着

echo 'echo foo' | sed 's/f/g/e'

prints goo . 打印goo

This can be used for your use case as follows: 可以将其用于您的用例,如下所示:

echo 'Lorem {{echo ipsum}}' | sed ':a /\(.*\){{\(.*\)}}\(.*\)/ { h; s//\1\n\3/; x; s//\2/e; G; s/\(.*\)\n\(.*\)\n\(.*\)/\2\1\3/; ba }'

The sed code works as follows: sed代码的工作方式如下:

:a                                    # jump label for looping, in case there are
                                      # several {{}} expressions in a line
/\(.*\){{\(.*\)}}\(.*\)/ {            # if there is a {{}} expression,
  h                                   # make a copy of the line
  s//\1\n\3/                          # isolate the surrounding parts
  x                                   # swap the original back in
  s//\2/e                             # isolate the command, execute, get output
  G                                   # get the outer parts we put into the hold
                                      # buffer
  s/\(.*\)\n\(.*\)\n\(.*\)/\2\1\3/    # rearrange the parts to put the command
                                      # output into the right place
  ba                                  # rinse, repeat until all {{}} are covered
}

This makes use of sed 's greedy matching in the regexes to always capture the last {{}} expression in a line. 这利用了正则表达式中sed的贪婪匹配来始终捕获一行中的最后一个{{}}表达式。 Note that it will have difficulties if there are several commands in a line and one of the later ones has multi-line output. 请注意,如果一行中有多个命令,而后一个命令具有多行输出,则会遇到困难。 Handling this case will require the definition of a marker that the commands embedded in the data are not allowed to have as part of their output and that the templates are not allowed to contain. 处理这种情况将需要定义一个标记,即不允许嵌入在数据中的命令作为其输出的一部分,并且不允许包含模板。 I would suggest something like {{{}}} , which would lead to 我建议使用类似{{{}}} ,这将导致

sed ':a /\(.*\){{\(.*\)}}\(.*\)/ { h; s//{{{}}}\1{{{}}}\3/; x; s//\2/e; G; s/\(.*\)\n{{{}}}\(.*\){{{}}}\(.*\)/\2\1\3/; ba }'

The reasoning behind this is that the template engine would run into trouble anyway if the embedded commands printed further {{}} terms. 其背后的原因是,如果嵌入式命令进一步打印{{}}项,则模板引擎无论如何都会遇到麻烦。 This convention is impossible to enforce, but then any code you pass into this template engine had better come from a trusted source, anyway. 不能强制执行此约定,但是无论如何,传递给此模板引擎的任何代码最好都来自受信任的来源。

Mind you, I am not sure that this whole thing is a sane idea 1 . 请注意,我不确定这整个事情是否是理智的想法1 You're not planning to use it in any sort of production code, are you? 您不打算在任何生产代码中使用它,对吗?

1 I am, however, quite sure whether it is a sane idea. 1但是,我很确定这是否是一个理智的想法。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM