I am having trouble replacing variables that appear inside strings in a bash script. For example, I want to replace
"test$FOO1=$FOO2" $BAR
with:
"test" .. FOO1 .. "=" .. FOO2 .. "" $BAR
I tried:
sed 's/\$\([A-Z0-9_]\+\)\b/" .. \1 .. "/g'
But I don't want to replace variables the same way outside of double-quoted strings. For example:
if [ $VARIABLE = 1 ]; then
has to be replaced by just
if VARIABLE then
Is there a way to replace only inside of double-quotes?
Background:
I want to convert a bash script into a Lua script.
I am aware that it will not be easy to convert every possible shell script this way. What I want to achieve is to replace all basic language constructs with Lua commands and to replace all variables and conditionals. Automating this will save a lot of work compared with translating bash into Lua by hand.
This, using GNU awk for multi-char RS, RT, and gensub(), shows one way to separate and then manipulate quoted (in RT) and unquoted (in $0) strings as a starting point:
$ cat tst.awk
BEGIN { RS="\"[^\"]*\""; ORS="" }
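{
    # Unquoted text: reduce "[ $VAR = value ];" to "VAR".
    $0 = gensub(/\[\s+[$]([[:alnum:]_]+)\s+=\s+\S+\s+];/,"\\1","g",$0)
    # Quoted text: a variable right before the closing quote ends the string.
    RT = gensub(/[$]([[:alnum:]_]+)"/,"\" .. \\1","g",RT)
    # Quoted text: any other variable splits the string around it.
    RT = gensub(/[$]([[:alnum:]_]+)/,"\" .. \\1 .. \"","g",RT)
    print $0 RT
}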
$ awk -f tst.awk file
"count: " .. FOO .. " times " .. BAR
if VARIABLE then
The above was run on this input file:
$ cat file
"count: $FOO times $BAR"
if [ $VARIABLE = 1 ]; then
NOTE: this approach of matching strings with regexps will only ever be a best effort based on the samples provided. You'd need a shell language parser to do the job robustly.
This might work for you (GNU sed):
sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^" ]*) /\1" .. \3 .. " /;ta;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([^"]*)"/\1" .. \3/;ta' file
When changing things within double quotes, we must first sail past any double-quoted strings that do not need changing. This means anchoring the regexp to the start of the line with the ^ metacharacter and repeating the substitution until no more cases remain.
First, skip zero or more characters which are not double quotes from the start of the line.
Second, skip double-quoted strings which do not contain the character of interest (TCOI), i.e. $, each followed by zero or more characters which are not double quotes; this group may occur zero or more times.
Third, skip a double quote followed by zero or more characters which are neither double quotes nor TCOI.
The following character (if it exists) must be TCOI. Everything matched before it is grouped in a back reference, \1.
Following TCOI, one or more conditions may be grouped. In the above example the first condition is when a variable (beginning with TCOI) is followed by a space. The second condition is when the variable is followed directly by ". Hence this entails two substitution commands; the ta command branches back to the label a whenever a substitution was successful.
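The iteration described above can be watched one pass at a time. The following is only a sketch: the helper names step and trace are hypothetical, and the condition after TCOI is simplified to "one or more alphanumerics or underscores" instead of the two conditions used in the command above.

```shell
#!/bin/sh
# One pass: rewrite the first $NAME found inside a double-quoted string.
# The anchored prefix skips unquoted text and quoted strings without a $.
step() {
    sed -E 's/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]_]+)/\1" .. \3 .. "/'
}

# Print the line, then each intermediate result, until a pass changes nothing.
trace() {
    cur=$1
    while :; do
        printf '%s\n' "$cur"
        next=$(printf '%s\n' "$cur" | step)
        [ "$next" = "$cur" ] && break
        cur=$next
    done
}

trace '"test$FOO1=$FOO2" $BAR'
# "test$FOO1=$FOO2" $BAR
# "test" .. FOO1 .. "=$FOO2" $BAR
# "test" .. FOO1 .. "=" .. FOO2 .. "" $BAR
```

The unquoted $BAR is never touched, because every pass has to consume whole quoted strings before it is allowed to look for a $.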
NB The if [ $VARIABLE = 1 ]; then situation can be treated in the same vein; here the [ acts as the opening double quote and the ] as the closing double quote.
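When the test has exactly the shape [ $NAME = value ], a simpler direct substitution may be enough. The sketch below is not the anchored loop described above; the function name unbracket is hypothetical, and it assumes that no such bracket test occurs inside a quoted string.

```shell
#!/bin/sh
# Reduce "[ $NAME = value ]" (with an optional trailing ";") to "NAME".
# \[ and \] are quoted because brackets are regexp metacharacters.
unbracket() {
    sed -E 's/\[ +\$([[:alnum:]_]+) += +[^] ]+ +\];?/\1/g'
}

printf '%s\n' 'if [ $VARIABLE = 1 ]; then' | unbracket
# if VARIABLE then
```

Note that this drops the compared value, as the question asks; keeping it would need a different replacement.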
PS TCOI was $, which is also a regexp metacharacter that represents the end of a line; it must therefore be quoted, e.g. \$.
PPS Don't forget to quote the ['s and ]'s too. If quoting is not your thing, enclose the character in [x], where x is the character to be quoted.
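The two quoting styles can be compared on a throwaway line. This is a small sketch; the sample text and the replacement strings are made up for illustration.

```shell
#!/bin/sh
line='price$ [x]'

# Unquoted $ is the end-of-line anchor, so the text is appended instead.
printf '%s\n' "$line" | sed 's/$/!/'           # price$ [x]!

# Backslash-quoted and bracket-quoted $ both match the literal character.
printf '%s\n' "$line" | sed 's/\$/!/'          # price! [x]
printf '%s\n' "$line" | sed 's/[$]/!/'         # price! [x]

# Literal [ and ] written as the bracket expressions [[] and []].
printf '%s\n' "$line" | sed 's/[[]x[]]/(x)/'   # price$ (x)
```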
EDIT: Since the original example has been replaced by the OP, here is a solution based on the new example:
sed -E ':a;s/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]]*)/\1" .. \3 .. "/;ta' file
Sorry, but I am posting this answer only to warn you that this is the wrong way to go!
Reading a language is a job for a real lexer and parser, not for sed or any other regex-based tool!
See GNU Bison and Berkeley Yacc (byacc).
You could also have a look at the bash sources to see how scripts are read.
Persisting with this approach will quickly lead to a big script, and then to unsolvable problems.
Using groups and a loop:
sed -e ':a' -e 's/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4 .. \5 .. /;t a'
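The parts of the command are:
^\(\([^"]*\("[^"]*"\)*\)*\) captures everything before the string being edited, i.e. unquoted text and complete quoted strings, in group 1.
\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\) captures the opening quote and the text before the variable in group 4 (prefix) and the variable name in group 5.
\1\4 .. \5 .. is the replacement.
:a and t a repeat the substitution for as long as it changes something.
With GNU sed you can shorten the command to a single expression (no separate -e is needed for the label a):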
sed ':a;s/^\(\([^"]*\("[^"]*"\)*\)*\)\("[^$"]*\)[$]\([A-Z0-9_]\{1,\}\)/\1\4 .. \5 .. /;t a'
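This assumes there are no escaped quotes (\") inside the strings. If there are, a first pass is needed to replace them with a placeholder, and a final pass to put them back after the main modification.
Note also that this replacement does not close and reopen the quotes around the variable; to get the "test" .. FOO1 .. "=" form asked for in the question, use \1\4" .. \5 .. " as the replacement.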
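The extra passes for escaped quotes might look like the sketch below. Several things here are assumptions: the placeholder @@Q@@ is arbitrary and must not occur in the input, the function name quote_vars_escaped is hypothetical, and the main pass is an anchored loop written in extended-regexp syntax that also emits the quotes around the variable, so it is not the exact command above.

```shell
#!/bin/sh
quote_vars_escaped() {
    # Pass 1 hides every \" behind the placeholder, the loop rewrites
    # $NAME inside quoted strings, and the last pass restores each \".
    sed -E -e 's/\\"/@@Q@@/g' \
        -e ':a' \
        -e 's/^([^"]*("[^"$]*"[^"]*)*"[^"$]*)\$([[:alnum:]_]+)/\1" .. \3 .. "/' \
        -e 'ta' \
        -e 's/@@Q@@/\\"/g'
}

printf '%s\n' '"say \"$MSG\" now"' | quote_vars_escaped
# "say \"" .. MSG .. "\" now"
```

Without the first pass, the escaped quotes would be counted as string delimiters and the loop would lose track of where each string starts.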