
Variable Manipulation not working as expected in macOS bash script

Given:

itemName='boo\boo\1\7\064.txt'

I want to convert the octal escapes to printable characters while removing the unprintable ones. The catch: I don't want to remove backslashed letters such as the \b. The result should be:

newItemName='boo\boo4.txt'

I can't figure out why part of the sed statement doesn't work correctly:

newItemName="$(printf "%s" "$itemName" | sed -E 's/(\\[0-7]{1,3})/'"$(somevar="&";printf "${somevar:1}";)"'/g' | tr -dc '[:print:]')"

I used somevar="&"; instead of directly accessing & so I could use variable manipulation.

The search statement s/(\\[0-7]{1,3})/ works fine.

In the printf, if I use $somevar or ${somevar:0} instead of ${somevar:1}, I get the original string as expected (e.g. \064). What doesn't work is ${somevar:1}. These don't work either: ${somevar/\\/} or ${somevar//\\/}.

  1. What am I misunderstanding about how variable manipulation works?
  2. Is there an easier way to do this? I've searched and searched...

Sam; long time no see! The problem here is the order of evaluation. All of the shell expressions, including the $(somevar="&";printf "${somevar:1}";), are evaluated before sed is even launched. As a result, somevar isn't the string matched by the regex, it's just a literal ampersand. That means ${somevar:1} is just the empty string, and you wind up just running sed -E 's/(\\[0-7]{1,3})//g'.
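To see the order of evaluation for yourself, here is a small sketch that builds the sed script the same way the question does and prints it instead of running it. The variable names replacement and sed_script are mine, introduced only for illustration.

```shell
#!/usr/bin/env bash
# The command substitution runs in the shell first; it knows nothing
# about what sed will match later.
replacement="$(somevar="&"; printf '%s' "${somevar:1}")"
printf 'replacement is [%s]\n' "$replacement"   # replacement is []

# This is the script that sed actually receives: a plain deletion.
sed_script='s/(\\[0-7]{1,3})/'"$replacement"'/g'
printf '%s\n' "$sed_script"                     # s/(\\[0-7]{1,3})//g
```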

You need a way to take the matched string and run a calculation on it (after it's been matched), and sed just isn't flexible enough to do this. But perl is. perl has an s operator, similar to sed's, but with the e modifier the replacement is evaluated as a perl expression rather than treated as a literal string. Give this a try:

newItemName="$(printf "%s\n" "$itemName" | perl -pe 's/\\([0-7]{1,3})/chr oct $1/eg' | tr -dc '[:print:]')"

What am I misunderstanding about how variable manipulation works?

I believe you are misunderstanding how sed works.

When the & character is used inside the replacement string, sed replaces it with the whole matched string. See this sed introduction.
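A quick way to watch & at work is to wrap each match in brackets. This is just an illustrative sketch; the sample string and the variable name wrapped are my own.

```shell
#!/usr/bin/env bash
# In a sed replacement, & stands for the whole text matched by the regex.
wrapped="$(printf '%s\n' 'boo\064.txt' | sed -E 's/\\[0-7]{1,3}/[&]/g')"
printf '%s\n' "$wrapped"   # boo[\064].txt
```

The substitution of & happens inside sed, at match time, which is why the shell cannot slice it up beforehand.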

Now about the ${var:offset} parameter expansion:

somevar='&'
printf "$somevar"

would print &. Then:

printf "${somevar:1}"

would extract the substring starting at offset 1 and running to the end of the string. The first character is at offset 0, so at offset 1 there is no character, because our variable somevar holds only one character. So it will print nothing.

printf "${somevar:0}"

would print the substring starting at offset 0 and running to the end of the string, which is the whole string. So ${somevar:0} is the same as $somevar. It will print &.
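The substring expansion can be checked in isolation. In this sketch the second variable, matched, is hypothetical: it holds the text you were hoping somevar would contain, to show that ${var:1} itself behaves as you expected once the variable really holds the match.

```shell
#!/usr/bin/env bash
somevar='&'
printf '[%s]\n' "${somevar:0}"   # [&]
printf '[%s]\n' "${somevar:1}"   # []   (nothing after the single character)

# What the question assumed somevar would hold at that point:
matched='\064'
printf '[%s]\n' "${matched:1}"   # [064]
```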

So:

$(somevar="&";printf "${somevar:1}";)

expands to nothing, because ${somevar:1} expands to nothing. So your sed command looks like this:

sed -E 's/(\\[0-7]{1,3})//g'

This sed command replaces every \ character followed by one to three digits in the range 0-7 with nothing, so the octal escapes are deleted rather than converted.

Now if it would be ${somevar:0} then:

$(somevar="&";printf "${somevar:0}";)

expands to &, so your sed command would look like this:

sed -E 's/(\\[0-7]{1,3})/&/g'

so it would replace each match of \\[0-7]{1,3} with itself, i.e. it changes nothing.
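Running both expanded commands on the sample input from the question shows the two outcomes side by side. The variable names deleted and unchanged are mine, used only for this sketch.

```shell
#!/usr/bin/env bash
itemName='boo\boo\1\7\064.txt'

# What ${somevar:1} turns the command into: every octal escape is removed.
deleted="$(printf '%s\n' "$itemName" | sed -E 's/(\\[0-7]{1,3})//g')"
printf '%s\n' "$deleted"     # boo\boo.txt

# What ${somevar:0} turns the command into: each match replaces itself.
unchanged="$(printf '%s\n' "$itemName" | sed -E 's/(\\[0-7]{1,3})/&/g')"
printf '%s\n' "$unchanged"   # boo\boo\1\7\064.txt
```

Note that the first result drops the 4 that \064 stands for, so plain deletion falls short of the boo\boo4.txt asked for in the question.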

You could drop the -E option and the (...) group, and just use POSIX-compatible sed:

sed 's/\\[0-7]\{1,3\}//g'

Is there an easier way to do this?

If deleting the octal escapes is all you need, your method looks fine. You could use a here string instead of printf, and you could tighten the sed pattern to match octal numbers more precisely, depending on your needs:

newItemName="$(
        <<<"$itemName" sed -E 's/\\([0-3][0-7]{0,2}|[0-7]{1,2})//g' |
        tr -dc '[:print:]'
)"
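To show what the tighter pattern buys you, here is a sketch comparing it with the simpler \\[0-7]{1,3} on an escape whose value does not fit in one byte (\777). I have written the pattern with -E, which is an assumption on my part so that the same line works with both GNU sed and the BSD sed shipped with macOS; the sample string and variable names are made up for the example.

```shell
#!/usr/bin/env bash
sample='a\777b\064c'

# Loose pattern: any one to three octal digits, so \777 is taken whole.
loose="$(<<<"$sample" sed -E 's/\\[0-7]{1,3}//g')"
printf '%s\n' "$loose"   # abc

# Tight pattern: three digits only when the first is 0-3 (value <= 0377),
# otherwise at most two digits, so \777 is read as \77 followed by 7.
tight="$(<<<"$sample" sed -E 's/\\([0-3][0-7]{0,2}|[0-7]{1,2})//g')"
printf '%s\n' "$tight"   # a7bc
```

Which behaviour is right depends on how the names were escaped in the first place, so treat the choice as something to check against your real data.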
