How to replace all dollar signs before all variables inside a double-quoted string with sed?

I have problems replacing variables that are inside strings in bash. For example, I want to replace

"test$FOO1=$FOO2" $BAR

with:

"test" .. FOO1 .. "=" .. FOO2 .. "" $BAR

I tried:

sed 's/\$\([A-Z0-9_]\+\)\b/" .. \1 .. "/g'

But I don't want variables outside of double-quoted strings to be replaced the same way. For example:

if [ $VARIABLE = 1 ]; then

has to be replaced by just

if VARIABLE then

Is there a way to replace only inside double quotes?

Background:
I want to convert a bash script into a Lua script.

I am aware that it will not be easy to convert all possible shell scripts this way. What I want to achieve is to replace all basic language constructs with Lua commands and to replace all variables and conditionals. Automating this will save much of the work of translating bash into Lua by hand.

The following, using GNU awk for multi-char RS, RT, and gensub(), shows one way to separate and then manipulate quoted (in RT) and unquoted (in $0) strings, as a starting point:

$ cat tst.awk
BEGIN { RS="\"[^\"]*\""; ORS="" }
{
    $0 = gensub(/\[\s+[$]([[:alnum:]_]+)\s+=\s+\S+\s+];/,"\\1","g",$0)
    RT = gensub(/[$]([[:alnum:]_]+)"/,"\" .. \\1","g",RT)
    RT = gensub(/[$]([[:alnum:]_]+)/,"\" .. \\1 .. \"","g",RT)
    print $0 RT
}

$ awk -f tst.awk file
"count: " .. FOO .. " times " .. BAR
if VARIABLE then

The above was run on this input file:

$ cat file
"count: $FOO times $BAR"
if [ $VARIABLE = 1 ]; then

NOTE: this approach of matching strings with regexps will only ever be a best effort based on the samples provided; you'd need a shell language parser to do the job robustly.

This might work for you (GNU sed):

sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^" ]*) /\1" .. \3  .. " /;ta;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^"]*)"/\1" .. \3/;ta' file

When changing things within double quotes, we must first sail past any double-quoted strings that do not need changing. This means anchoring the regexp to the start of the line using the ^ metacharacter and iterating the regexp until no cases remain.

First, eliminate zero or more characters which are not double quotes from the start of the line.

Second, eliminate double-quoted strings which do not contain the character of interest (TCOI), i.e. $, each followed by zero or more characters which are not double quotes; repeat this zero or more times.

Third, eliminate a double quote followed by zero or more characters which are neither double quotes nor TCOI, i.e. $.

The following character (if it exists) must be TCOI. Group everything matched so far so that it can be put back with the back reference \1.

Following TCOI, one or more conditions may be handled. In the above example the first condition is when a variable (beginning with TCOI) is followed by a space. The second condition is when the variable is followed directly by ". This entails two substitution commands; after each of them the ta command branches back to the label a if the substitution was successful.

N.B. The if [ $VARIABLE = 1 ]; then situation can be treated in the same vein: here the [ plays the role of the opening double quote and the ] that of the closing double quote.

P.S. TCOI was $, which is also a regexp metacharacter representing the end of a line; it must therefore be quoted, e.g. \$.

P.P.S. Don't forget to quote the ['s and ]'s too. If quoting is not your thing, enclose the character in a bracket expression [x], where x is the character to be quoted.
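
The three "eliminate" steps above correspond to three pieces of the regexp. As a rough sketch only, the command can be assembled from named parts so that each step is visible. The shell variable names and the function name mark_vars are made up for this illustration, and GNU sed is assumed for -E:

```shell
# Hypothetical names; each piece mirrors one step of the explanation.
unquoted='[^"]*'                   # first: text before any double quote
clean_strings='("[^"$]*"[^"]*)*'   # second: whole strings without $, plus what follows them
open_string='"[^"$]*'              # third: the opening quote and the text up to the $
prefix="^(${unquoted}${clean_strings}${open_string})"   # everything put back as \1

mark_vars() {
    # Same two substitutions as in the one-liner, each followed by "ta".
    sed -E -e ':a' \
        -e 's/'"$prefix"'\$([^" ]*) /\1" .. \3  .. " /' -e 'ta' \
        -e 's/'"$prefix"'\$([^"]*)"/\1" .. \3/' -e 'ta'
}

printf '%s\n' '"count: $FOO times $BAR"' 'if [ $VARIABLE = 1 ]; then' | mark_vars
```

The second input line contains no double quote, so the prefix cannot match and the line passes through unchanged.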

EDIT:

sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]]*)/\1" .. \3  .. "/;ta' file

Since the original example has been replaced by the OP, here is a solution based on the new example.
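
As a quick check, assuming GNU sed, the example from the question can be piped through the simplified command. The wrapper names concat_vars and concat_vars_us are hypothetical. One caveat: [[:alnum:]] does not include the underscore, so a name such as $MY_DIR is cut at the underscore. Adding _ to the bracket expression is one possible adjustment; it is not part of the original answer.

```shell
# The EDIT command, wrapped in a function so it can be reused on stdin.
concat_vars() {
    sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]]*)/\1" .. \3  .. "/;ta'
}

# Assumed variant: the same command, but underscores count as part of a name.
concat_vars_us() {
    sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]_]*)/\1" .. \3  .. "/;ta'
}

printf '%s\n' '"test$FOO1=$FOO2" $BAR' | concat_vars
printf '%s\n' '"dir=$MY_DIR"' | concat_vars      # name is split at the underscore
printf '%s\n' '"dir=$MY_DIR"' | concat_vars_us   # name is kept whole
```

In all three runs only dollar signs inside double quotes are touched; the trailing $BAR in the first line stays as it is.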

bash lexer for shell!?

I'm so sorry: I am posting this answer only to warn you that this is the wrong way!

Reading a language is a job for a proper lexer, not for sed or any other regex-based tool!

See GNU Bison, Berkeley Yacc (byacc).

You could have a look at 's sources in order to see how scripts are read!

Persisting in this way will quickly lead you to a big script, and from there to unsolvable problems.

Using groups and a loop

sed -e ':a' -e 's/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4 .. \5 .. /;t a'
  1. isolate the string from the preceding part with ^\(\([^"]*\("[^"]*"\)*\)*\) in group 1
  2. select the variable inside the isolated string with \("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\), giving group 4 (prefix) and group 5 (variable name)
  3. change it as you like with \1\4 .. \5 .. 
  4. repeat this operation for as long as a change occurs, using :a and t a

With GNU sed you can shorten the command (no separate -e is needed for the label a):

sed ':a;s/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4 .. \5 .. /;t a'

This assumes there are no escaped quotes inside the strings. If there are, a first pass is needed to replace them, and they have to be put back after the main modification.
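
A minimal sketch of that extra pass follows. It makes three assumptions that are not in the answer above: the byte \001 never occurs in the input and can serve as a placeholder, the function name to_lua_concat is made up, and the replacement is written as \1\4" .. \5 .. " so that the string is closed before the variable and reopened after it, as in the question's expected output.

```shell
to_lua_concat() {
    ph=$(printf '\001')   # placeholder byte, assumed absent from the input
    sed -e 's/\\"/'"$ph"'/g' \
        -e ':a' \
        -e 's/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4" .. \5 .. "/' \
        -e 't a' \
        -e 's/'"$ph"'/\\"/g'
    # 1st -e: hide every escaped quote \" behind the placeholder
    # loop:   the main substitution, repeated while it still matches
    # last:   turn each placeholder back into \"
}

printf '%s\n' '"test$FOO1=$FOO2" $BAR' | to_lua_concat
printf '%s\n' '"say \"hi\" to $NAME"' | to_lua_concat
```

With the escaped quotes hidden, the quote-pair counting in group 1 stays correct, so $NAME is still recognised as being inside the string.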
