How to replace all dollar signs before all variables inside a double-quoted string with sed?
I am having trouble replacing variables that appear inside strings in a bash script. For example, I want to replace
"test$FOO1=$FOO2" $BAR
with:
"test" .. FOO1 .. "=" .. FOO2 .. "" $BAR
I tried:
sed 's/\$\([A-Z0-9_]\+\)\b/" .. \1 .. "/g'
But I don't want variables outside of double-quoted strings to be replaced the same way, e.g.:
if [ $VARIABLE = 1 ]; then
has to be replaced by just
if VARIABLE then
Is there a way to replace only inside of double quotes?
Background:
I want to convert a bash script into a Lua script.
I am aware that it will not be easy to convert every possible shell script this way, but what I want to achieve is to replace all basic language constructs with Lua commands and to replace all variables and conditionals. Automating this will save a lot of work compared with translating bash into Lua by hand.
This script, which needs GNU awk for multi-char RS, RT, and gensub(), shows one way to separate and then manipulate quoted (in RT) and unquoted (in $0) strings, as a starting point:
$ cat tst.awk
BEGIN { RS="\"[^\"]*\""; ORS="" }
{
$0 = gensub(/\[\s+[$]([[:alnum:]_]+)\s+=\s+\S+\s+];/,"\\1","g",$0)
RT = gensub(/[$]([[:alnum:]_]+)"/,"\" .. \\1","g",RT)
RT = gensub(/[$]([[:alnum:]_]+)/,"\" .. \\1 .. \"","g",RT)
print $0 RT
}
$ awk -f tst.awk file
"count: " .. FOO .. " times " .. BAR
if VARIABLE then
The above was run on this input file:
$ cat file
"count: $FOO times $BAR"
if [ $VARIABLE = 1 ]; then
NOTE: matching strings with regexps like this will only ever be a best effort based on the samples provided; you'd need a shell language parser to do the job robustly.
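To make the RS/RT split above easier to see, here is a minimal sketch that only prints each record and its terminator, without rewriting anything. The function name show_records is made up for illustration, and the call is guarded because RT and a regexp RS are GNU awk extensions that other awks may not provide.

```shell
# Hypothetical helper: print what gawk puts in $0 (text outside quotes)
# and in RT (the double-quoted string that ended the record).
show_records() {
  gawk 'BEGIN { RS = "\"[^\"]*\"" }
        { printf "unquoted=[%s] quoted=[%s]\n", $0, RT }'
}

# Only run the demo when gawk is actually installed.
if command -v gawk >/dev/null 2>&1; then
  printf '%s' 'echo "a $X" and "b" end' | show_records
fi
```

Each output line pairs one unquoted chunk with the quoted string that follows it, which is why tst.awk can apply one set of rules to $0 and another to RT.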
This might work for you (GNU sed):
sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^" ]*) /\1" .. \3 .. " /;ta;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^"]*)"/\1" .. \3/;ta' file
When changing things within double quotes, we must first sail past any double-quoted strings that do not need changing. This means anchoring the regexp to the start of the line using the ^ metacharacter and iterating the regexp until all cases cease to exist.
First, eliminate (that is, match past) zero or more characters which are not double quotes from the start of the line.
Second, eliminate double-quoted strings which do not contain the character of interest (TCOI), i.e. $, each followed by zero or more characters which are not double quotes, zero or more times.
Third, eliminate a double quote followed by zero or more characters which are not double quotes or TCOI, i.e. $.
The following character (if it exists) must be TCOI. Group everything matched so far into the back reference \1.
Following TCOI, one or more conditions may be grouped. In the above example the first condition is when a variable (beginning with TCOI) is followed by a space. The second condition is when the variable is followed directly by ". Hence this entails two substitution commands; the ta command branches to the loop label a when a substitution was successful.
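As a reading aid, the same two substitutions can be laid out one command per line with comments. This is only a sketch of the one-liner above wrapped in a function whose name, convert_quoted_vars, is invented here; it assumes GNU sed and no escaped quotes inside the strings.

```shell
# Hypothetical wrapper around the two-substitution loop from the answer.
convert_quoted_vars() {
  sed -E '
    :a
    # Case 1: the variable is followed by a space inside the string.
    # \1 is everything up to the $, \3 is the variable name.
    s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^" ]*) /\1" .. \3 .. " /
    ta
    # Case 2: the variable is followed directly by the closing quote.
    s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^"]*)"/\1" .. \3/
    ta
  '
}

printf '%s\n' '"count: $FOO times $BAR"' | convert_quoted_vars
# prints: "count: " .. FOO .. " times " .. BAR
```

A line without double quotes, such as if [ $VARIABLE = 1 ]; then, passes through unchanged because both patterns require an opening " before the $.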
NB The if [ $VARIABLE = 1 ]; then situation can be treated in the same vein; here the [ is the opening double quote and the ] is the closing double quote.
PS TCOI was $, which is also a regexp metacharacter that represents the end of a line; it must therefore be quoted, e.g. \$
PPS Don't forget to quote the ['s and ]'s too. If quoting is not your thing, enclose the character in [x], where x is the character to be quoted.
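The two quoting styles mentioned in the PS and PPS can be compared side by side on the if [ ... ] case. This is a deliberately simplified sketch: it uses a single substitution rather than the loop, it drops the comparison the way the question's desired output does, and both function names are made up for illustration.

```shell
# Style 1: backslash-quote the metacharacters [, ] and $.
convert_test_escaped() {
  sed -E 's/\[ +\$([[:alnum:]_]+) += +[^ ]+ +\];/\1/'
}

# Style 2: put each metacharacter in a bracket expression instead:
# [[] matches a literal [, []] a literal ], and [$] a literal $.
convert_test_bracketed() {
  sed -E 's/[[] +[$]([[:alnum:]_]+) += +[^ ]+ +[]];/\1/'
}

printf '%s\n' 'if [ $VARIABLE = 1 ]; then' | convert_test_escaped
printf '%s\n' 'if [ $VARIABLE = 1 ]; then' | convert_test_bracketed
# both print: if VARIABLE then
```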
EDIT:
sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]]*)/\1" .. \3 .. "/;ta' file
Since the original example has been replaced by the OP, here is a solution based on the new example.
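For reference, this is the EDIT command run on the question's new example, next to a variant that is not part of the answer: [[:alnum:]] does not include the underscore, so a name such as MY_VAR would be cut at the _. The variant below adds _ to the class and requires at least one character, which is an assumption about what is wanted. Both function names are invented for this sketch.

```shell
# The command from the EDIT, unchanged, wrapped in a function.
convert_edit() {
  sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]]*)/\1" .. \3 .. "/;ta'
}

# Assumed variant: allow underscores in names and require a non-empty name.
convert_edit_underscore() {
  sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]_]+)/\1" .. \3 .. "/;ta'
}

printf '%s\n' '"test$FOO1=$FOO2" $BAR' | convert_edit
# prints: "test" .. FOO1 .. "=" .. FOO2 .. "" $BAR

printf '%s\n' '"path=$MY_VAR/bin"' | convert_edit
# prints: "path=" .. MY .. "_VAR/bin"

printf '%s\n' '"path=$MY_VAR/bin"' | convert_edit_underscore
# prints: "path=" .. MY_VAR .. "/bin"
```

Note that $BAR outside the quotes is left alone in the first call, which is what the question asked for.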
I'm so sorry: I am only posting this answer to warn you that this is the wrong way!
Reading a language is a job for a consistent lexer, not for sed or any other regex-based tool!
See GNU Bison, Berkeley Yacc (byacc).
You could have a look at bash's sources to see how scripts are read!
Persisting down this path will quickly lead to a big script, and then to unsolvable problems.
Using groups and a loop
sed -e ':a' -e 's/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4" .. \5 .. "/;t a'
^\(\([^"]*\("[^"]*"\)*\)*\) in group 1: isolates the string from the part of the line before it
\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\) in group 4 (prefix) and group 5 (var name): selects the var content (variable name) in the isolated string
\1\4" .. \5 .. " is the replacement: groups 1 and 4 are kept, the string is closed, and the variable name is concatenated before the string is reopened
:a and t a form the loop: :a defines the label a, and t a branches back to it after a successful substitution
With GNU sed you can reduce the command to (no -e needed to target the label a):
sed ':a;s/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4" .. \5 .. "/;t a'
This assumes there are no escaped quotes in the string. If there are, a first pass is needed to change them, and another to put them back after the main modification.
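One way to sketch that extra pass, assuming GNU sed: since sed reads one line at a time, the pattern space never contains a newline, so a newline can stand in for each \" while the main loop runs and be swapped back afterwards. The function name and the choice of newline as the placeholder are assumptions for illustration, and the loop shown is a generic one in the same spirit rather than the exact command above.

```shell
# Hypothetical sketch: protect escaped quotes, convert, then restore them.
convert_with_escaped_quotes() {
  sed -E '
    # First pass: hide every \" behind a newline placeholder.
    s/\\"/\n/g
    :a
    # Main modification: rewrite one $NAME inside a double-quoted string.
    s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]_]+)/\1" .. \3 .. "/
    ta
    # Last pass: put the escaped quotes back.
    s/\n/\\"/g
  '
}

printf '%s\n' '"say \"$WHO\"" $X' | convert_with_escaped_quotes
# prints: "say \"" .. WHO .. "\"" $X
```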