
BASH escaping double quotes within single quotes

I'm trying to write a bash function that escapes all double quotes that appear inside single quotes, e.g.:

'I need to escape "these" quotes with backslashes'

would become

'I need to escape \"these\" quotes with backslashes'

My take on it was:

  1. Find pairs of single quotes in the input and extract them with grep
  2. Pipe the match into sed and escape the double quotes
  3. Run sed again over the whole input and replace the grep match with the escaped version

I got as far as producing the correctly escaped quoted section, but substituting it back into the whole input fails.

The script code, copy-pasted:

# $1 - Full name, $2 - minified name
adjust_quotes ()
{
    SINGLE_QUOTES=`grep -Eo "'.*'" $2`
    ESCAPED_QUOTES=`echo $SINGLE_QUOTES | sed 's|"|\\\\"|g'`
    sed -r "s|'.*'|$ESCAPED_QUOTES|g" "$2" > "$2.escaped"
    mv "$2.escaped" $2
    echo "Quotes escaped within single quotes on $2"
}

Random additional questions:

  • In the console, escaping the quote with only two backslashes works, but when the code is put in a script I need four. I'd love to know why.
  • Could I modify this code into a loop to escape all pairs of single quotes, one after another until EOF?

Thanks!

PS: I know this would probably be easier to do in e.g. Python, but I really need to keep it in bash.

Using BASH string replacement:

s='I need to escape "these" quotes with backslashes'
r="${s//\"/\\\"}"
echo "$r"
I need to escape \"these\" quotes with backslashes
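
The backslash counting in that expansion can be hard to read. As a minimal sketch of the same substitution, the search character and the replacement can be held in variables instead (the names dq and esc are illustrative and not part of the answer). Note that this form escapes every double quote in the string, whether or not it sits inside single quotes.

```shell
#!/usr/bin/env bash
s='I need to escape "these" quotes with backslashes'
dq='"'     # character to search for (illustrative name)
esc='\"'   # replacement: a backslash followed by a double quote (illustrative name)

# ${var//pattern/replacement} replaces every match of pattern in var.
# Quoting "$dq" and "$esc" makes bash treat both as literal text.
r=${s//"$dq"/"$esc"}
printf '%s\n' "$r"
```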

Here's a pure bash solution, which does the transformation on stdin, printing to stdout. It reads the entire input into memory, so it won't work with really enormous files.

escape_enclosed_quotes() (
  IFS=\'
  read -d '' -r -a fields
  for ((i=1; i<${#fields[@]}; i+=2)); do
    fields[i]=${fields[i]//\"/\\\"}
  done
  printf %s "${fields[*]}"
)

I deliberately enclosed the body of the function in parentheses rather than braces, in order to force the body to run in a subshell. That limits the modification of IFS to the body, as well as implicitly making the variables used local.

The function uses the read builtin to read the entire input (since the line delimiter is set to NUL with -d '' ) into an array ( -a ) using a single quote as the field separator ( IFS=\' ). The result is that the parts of the input surrounded with single quotes are in the odd positions of the array, so the function loops over the odd indices to do the substitution only for those fields. I use bash's find-and-replace syntax instead of deferring to an external utility like sed .
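
As a quick check of that behaviour, here is a small sketch that feeds the function a line with double quotes both inside and outside single quotes. The sample input is made up for illustration; the function body is the one shown above.

```shell
#!/usr/bin/env bash
# Function reproduced from the answer so this snippet runs on its own.
escape_enclosed_quotes() (
  IFS=\'
  read -d '' -r -a fields
  for ((i=1; i<${#fields[@]}; i+=2)); do
    fields[i]=${fields[i]//\"/\\\"}
  done
  printf %s "${fields[*]}"
)

# Hypothetical sample line: only the "d" pair sits inside single quotes.
out=$(printf '%s\n' "a \"b\" 'c \"d\" e' f \"g\"" | escape_enclosed_quotes)
printf '%s\n' "$out"

# To rewrite a file, one option is to go through a temporary file
# ("$f" is a hypothetical file name):
#   escape_enclosed_quotes < "$f" > "$f.escaped" && mv "$f.escaped" "$f"
```

Only the double quotes in the single-quoted span gain a backslash; the ones around b and g are left alone.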

This being bash, there are a couple of gotchas:

  1. If the file contains a NUL, the rest of the file will be ignored.
  2. If the last line of the file does not end with a newline, and the last character of that line is a single quote, it will not be output.

Both of the above conditions are impossible in a portable text file, so it's probably OK. All the same, worth taking note.
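
The second gotcha can be reproduced with a short sketch. The input below is a made-up example that ends in a single quote with no trailing newline; the function is the one from this answer, repeated so the snippet is self-contained.

```shell
#!/usr/bin/env bash
escape_enclosed_quotes() (
  IFS=\'
  read -d '' -r -a fields
  for ((i=1; i<${#fields[@]}; i+=2)); do
    fields[i]=${fields[i]//\"/\\\"}
  done
  printf %s "${fields[*]}"
)

# printf %s writes no trailing newline, so the closing single quote is
# the very last character of the input.
out=$(printf %s "x 'a\"b'" | escape_enclosed_quotes)
printf '%s\n' "$out"
```

The closing single quote is missing from the result because read drops the empty field after a trailing separator, so there is nothing left to join it back onto.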


The supplementary question: why are the extra backslashes needed in

ESCAPED_QUOTES=`echo $SINGLE_QUOTES | sed 's|"|\\\\"|g'`

Answer: It has nothing to do with that line being in a script. It has to do with your use of backticks ( `...` ) for command substitution, and the idiosyncratic and often unpredictable handling of backslashes inside backticks. This syntax is deprecated. Do not use it. (Not even if you see someone else using it in some random example on the internet.) If you had used the recommended $(...) syntax for command substitution, it would have worked as expected:

ESCAPED_QUOTES=$(echo $SINGLE_QUOTES | sed 's|"|\\"|g')

(More information is in the Bash FAQ linked above.)
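
The difference can be seen side by side in the sketch below, which assumes a sed that treats \" in the replacement as a plain double quote (GNU sed does). The variable names and the sample string are illustrative.

```shell
#!/usr/bin/env bash
s='say "hi"'

# Inside backticks the shell first collapses \\ to \, even within single
# quotes, so sed receives s|"|\"|g and replaces each " with a plain ".
old=`printf '%s\n' "$s" | sed 's|"|\\"|g'`

# Four backslashes survive as two, so sed receives s|"|\\"|g.
four=`printf '%s\n' "$s" | sed 's|"|\\\\"|g'`

# With $(...) the single-quoted sed script reaches sed untouched.
new=$(printf '%s\n' "$s" | sed 's|"|\\"|g')

printf 'backticks, two backslashes:  %s\n' "$old"
printf 'backticks, four backslashes: %s\n' "$four"
printf '$(...), two backslashes:     %s\n' "$new"
```

The first line prints the string unchanged, while the other two print it with each double quote preceded by a backslash.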
