bash string quoted multi-word args to array

The Question:

In bash scripting, what is the best way to convert a string containing literal quotes around multi-word arguments into an array, with the same result as if the shell had parsed those arguments?

The Controversy:

Many existing questions on this topic sidestep the problem instead of solving it. This question raises the arguments below and asks the reader to focus on them and, if you are up for it, to take part in the challenge of finding the optimum solution.

Arguments raised:

  1. Although there are many scenarios where this pattern should be avoided because better-suited alternatives exist, the author is of the opinion that valid use cases still remain. This question attempts to present one such use case. It makes no claim about its viability, only that it is a conceivable scenario that could arise in a real-world situation.
  2. You must find the optimum solution that satisfies the requirement. The use case was chosen specifically for its real-world applications. You may not agree with the decisions that were made, but you are not asked for an opinion, only to deliver the solution.
  3. Satisfy the requirement without modifying the input or the choice of transport. Both were chosen with a real-world scenario in mind, to support the premise that those parts are out of your control.
  4. No answers exist to this particular problem, and this question aims to address that. If you are inclined to avoid this pattern, then simply avoid the question; but if you think you are up for the challenge, let's see how you would approach the problem.

The Valid use case:

Converting an existing script that is currently in use so that it receives its parameters via a named pipe or similar stream. To minimize the impact on the myriad of scripts outside the developer's control, a decision was made not to change the interface. Existing scripts must be able to pass the same arguments via the new stream implementation as they did before.

Existing implementation:

$ ./string2array arg1 arg2 arg3
args=(
    [0]="arg1"
    [1]="arg2"
    [2]="arg3"
)

Required change:

$ echo "arg1 arg2 arg3" | ./string2array
args=(
    [0]="arg1"
    [1]="arg2"
    [2]="arg3"
)

The problem:

As pointed out in "Bash and Double-Quotes passing to argv", literal quotes are not parsed the way one might expect.

This workbench script can be used to test various solutions; it handles the transport and formulates a measurable response. It is suggested that you focus on the solution script, which is sourced with the string as its argument and should populate the args variable as an array.

The string2array workbench script:

#!/usr/bin/env bash
#string2array

args=()

function inspect() {
  local inspct=$(declare -p args)
  inspct=${inspct//\[/\\n\\t[}; inspct=${inspct//\'/}; inspct="${inspct:0:-1}\n)"
  echo -e ${inspct#*-a }
}

while read -r; do
  # source the solution to turn $REPLY into the args array
  source "$1" "${REPLY}"
  inspect
done

Standard solution - FAILS

The solution for turning a string into a space-delimited array of words worked for our first example above:

#solution1

args=($@)

Undesired result

Unfortunately, the standard solution produces an undesired result for quoted multi-word arguments:

$ echo 'arg1 "multi arg 2" arg3' | ./string2array solution1
args=(
    [0]="arg1"
    [1]="\"multi"
    [2]="arg"
    [3]="2\""
    [4]="arg3"
)
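
To see why the standard solution falls short, it helps to compare quotes that the shell parses as syntax with quotes that are merely characters stored in a variable. A minimal sketch (the variable names line, typed and split are only for illustration):

```bash
#!/usr/bin/env bash
line='arg1 "multi arg 2" arg3'

# Quotes typed in the script are syntax: the shell removes them while parsing.
typed=(arg1 "multi arg 2" arg3)

# Quotes held in a variable are ordinary characters: an unquoted expansion is
# only split on IFS (and globbed), it is never re-parsed for quoting.
split=($line)

echo "typed: ${#typed[@]} elements"   # typed: 3 elements
echo "split: ${#split[@]} elements"   # split: 5 elements
printf '[%s]\n' "${split[@]}"         # [arg1] ["multi] [arg] [2"] [arg3]
```

Any solution therefore has to interpret the quote characters itself, or hand the string to something that does.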

The Challenge:

Using the workbench script, provide a solution snippet that will produce the following result for the arguments received.

Desired result:

$ echo 'arg1 "multi arg 2" arg3' | ./string2array solution-xyz
args=(
    [0]="arg1"
    [1]="multi arg 2"
    [2]="arg3"
)

The solution should be compatible with standard argument parsing in every way. The following unit test should pass for the provided solution. If you can think of anything currently missing from the unit test, please leave a comment and we can update it.

Unit test for the requirements

Update: Test simplified and includes the Jonathan Leffler test

#!/usr/bin/env bash
#test_string2array
solution=$1
function test() {
  cmd="echo \"${1}\" | ./string2array $solution"
  echo "$ ${cmd}"
  echo ${1} | ./string2array $solution > /tmp/t
  cat /tmp/t
  echo -n "Result : "
  [[ $(cat /tmp/t|wc -l) -eq 7 ]] && echo "PASSED!" || echo "FAILED!"
}

echo 1. Testing single args
test 'arg1 arg2 arg3 arg4 arg5'
echo
echo 2. Testing multi args \" quoted
test 'arg1 "multi arg 2" arg3 "a r g 4" arg5'
echo
echo 3 Testing multi args \' quoted
test "arg1 'multi arg 2' arg3 'a r g 4' arg5"
echo
echo 4 Jonathan Leffler test
test "He said, \"Don't do that!\" but \"they didn't listen.\""
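
The line-count check in the unit test only verifies that five elements came back, not what they contain. As a possible addition, here is a sketch of a stricter check that compares every element with an expected value. The helper name expect_args is hypothetical, and it assumes the solution has already populated the global args array:

```bash
#!/usr/bin/env bash
# expect_args EXPECTED... : compare the global args array with the expected words.
expect_args() {
  # Copying args into actual also closes any gaps left by unset.
  local -a expected=("$@") actual=("${args[@]}")
  local i
  if (( ${#actual[@]} != ${#expected[@]} )); then
    echo "FAILED! got ${#actual[@]} elements, expected ${#expected[@]}"
    return 1
  fi
  for (( i = 0; i < ${#expected[@]}; i++ )); do
    if [[ ${actual[i]} != "${expected[i]}" ]]; then
      echo "FAILED! element $i is '${actual[i]}', expected '${expected[i]}'"
      return 1
    fi
  done
  echo "PASSED!"
}

# A correct parse passes ...
args=(arg1 "multi arg 2" arg3)
expect_args arg1 "multi arg 2" arg3

# ... while five wrong elements fail, although the line count would be right.
args=(arg1 '"multi' 'arg"' arg3 arg5)
expect_args arg1 "multi arg" arg3 arg4 arg5
```

Inside the workbench, the call would go right after the solution has been sourced, with the expected words supplied by the test case.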

The declare built-in seems to do what you want; in my test, it is your inspect function that does not seem to work properly for all inputs:

# solution3
declare -a "args=($1)"

Then

$ echo "arg1 'arg2a arg2b' arg3" | while read -r; do
>  source solution3 "${REPLY}"
>  for arg in "${args[@]}"; do
>   echo "Arg $((++i)): $arg"
>  done
> done
Arg 1: arg1
Arg 2: arg2a arg2b
Arg 3: arg3

So I think xargs actually works for all the test cases, for example:

echo 'arg1 "multi arg 2" arg3' | xargs -0 ./string2array
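
As a sketch of how the xargs idea could be used to fill the args array directly: in its default mode xargs understands single quotes, double quotes and backslashes, whereas GNU xargs with -0 treats quotes literally, so the example below leaves -0 out. The variable name line and the use of printf to emit one argument per line are assumptions made for illustration:

```bash
#!/usr/bin/env bash
line='arg1 "multi arg 2" arg3'

# xargs splits the line using its own quoting rules and passes each word as a
# separate argument to printf, which prints one word per output line.
mapfile -t args < <(printf '%s\n' "$line" | xargs printf '%s\n')

printf '[%s]\n' "${args[@]}"
```

xargs never expands $ or backticks, but it reports an error on an unmatched quote, and its quoting rules are close to, not identical with, those of the shell.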

You can do it with declare instead of eval, for example:

Instead of:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
echo "Initial string: $string"
eval 'for word in '$string'; do echo $word; done'

Do:

declare -a "array=($string)"
for item in "${array[@]}"; do echo "[$item]"; done

But please note, this is not much safer if the input comes from the user!

So, if you try it with a string like:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'

you get hostname evaluated (it could of course just as well be something like rm -rf /)!
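
A harmless way to observe this is to use echo as a stand-in for the dangerous command. This is only an illustrative sketch; the string contents are made up:

```bash
#!/usr/bin/env bash
# The data carries a command substitution; echo stands in for something harmful.
string='safe "two words" $(echo INJECTED)'

# declare expands the words inside the parentheses, including $(...).
declare -a "array=($string)"

printf '[%s]\n' "${array[@]}"
```

The last element should come out as INJECTED rather than as the literal text $(echo INJECTED), which shows that the command inside the data was executed.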

A very simple attempt to guard against this is to replace characters such as the backtick ` and $:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'
declare -a "array=( $(echo $string | tr '`$<>' '????') )"
for item in "${array[@]}"; do echo "[$item]"; done

Now you get output like:

[aString that may haveSpaces IN IT]
[bar]
[foo]
[bamboo]
[bam boo]
[?hostname?]

More details about the different methods and their pros and cons can be found in this good answer: Why should eval be avoided in Bash, and what should I use instead?

See also https://superuser.com/questions/1066455/how-to-split-a-string-with-quotes-like-command-arguments-in-bash/1186997#1186997

But this still leaves an attack vector. I would very much like bash to have a string quoting method that works like double quotes (") but without interpreting the content.
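
One way to get quote handling without any interpretation of the content is to walk the string character by character in pure bash. The following is a sketch only, not a complete reimplementation of shell parsing: the function name split_quoted is hypothetical, it fills the global args array, and it supports just single quotes, double quotes and backslash escapes:

```bash
#!/usr/bin/env bash
# split_quoted STRING : split STRING into the global args array, honouring
# '...', "..." and backslash escapes. Nothing is expanded or executed.
split_quoted() {
  local input=$1 word='' quote='' c next
  local -i i in_word=0
  args=()
  for (( i = 0; i < ${#input}; i++ )); do
    c=${input:i:1}
    next=${input:i+1:1}
    if [[ $quote == "'" ]]; then
      # Inside single quotes everything is literal up to the closing quote.
      if [[ $c == "'" ]]; then quote=''; else word+=$c; fi
    elif [[ $quote == '"' ]]; then
      if [[ $c == '"' ]]; then
        quote=''
      elif [[ $c == '\' && ( $next == '"' || $next == '\' ) ]]; then
        word+=$next; (( i += 1 ))   # \" and \\ inside double quotes
      else
        word+=$c
      fi
    elif [[ $c == "'" || $c == '"' ]]; then
      quote=$c; in_word=1           # an opening quote starts or continues a word
    elif [[ $c == '\' && -n $next ]]; then
      word+=$next; (( i += 1 )); in_word=1
    elif [[ $c == [[:space:]] ]]; then
      # Unquoted whitespace ends the current word, if there is one.
      if (( in_word )); then args+=("$word"); word=''; in_word=0; fi
    else
      word+=$c; in_word=1
    fi
  done
  [[ -n $quote ]] && return 1       # unterminated quote
  if (( in_word )); then args+=("$word"); fi
  return 0
}

split_quoted 'arg1 "multi arg 2" `hostname` $HOME'
printf '[%s]\n' "${args[@]}"   # [arg1] [multi arg 2] [`hostname`] [$HOME]
```

Used as a workbench solution, the file would end with split_quoted "$1" instead of the demo call. Because nothing is ever passed to eval or declare, backticks and $ stay literal text.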

First attempt

Build up a variable with the combined words once an opening quote is detected, and only append it to the array once the closing quote arrives.

Solution

#solution2
j=''
for a in ${1}; do
  if [ -n "$j" ]; then
    [[ $a =~ ^(.*)[\"\']$ ]] && {
      args+=("$j ${BASH_REMATCH[1]}")
      j=''
    } || j+=" $a"
  elif [[ $a =~ ^[\"\'](.*)$ ]]; then
    j=${BASH_REMATCH[1]}
  else
    args+=("$a")
  fi
done

Unit test results:

$ ./test_string2array solution2
1. Testing single args
$ echo "arg1 arg2 arg3 arg4 arg5" | ./string2array solution2
args=(
    [0]="arg1"
    [1]="arg2"
    [2]="arg3"
    [3]="arg4"
    [4]="arg5"
)
Result : PASSED!

2. Testing multi args " quoted
$ echo 'arg1 "multi arg 2" arg3 "a r g 4" arg5' | ./string2array solution2
args=(
    [0]="arg1"
    [1]="multi arg 2"
    [2]="arg3"
    [3]="a r g 4"
    [4]="arg5"
)
Result : PASSED!

3 Testing multi args ' quoted
$ echo "arg1 'multi arg 2' arg3 'a r g 4' arg5" | ./string2array solution2
args=(
    [0]="arg1"
    [1]="multi arg 2"
    [2]="arg3"
    [3]="a r g 4"
    [4]="arg5"
)
Result : PASSED!

Second attempt

Append to the element in place, without the need for an additional variable.

#solution3
for i in $1; do
  [[ $i =~ ^[\"\'] ]] && args+=(' ')
  lst=$(( ${#args[@]}-1 ))
  [[ "${args[*]}" =~ [[:space:]]$ ]] && args[$lst]+="${i/[\"\']/} " ||  args+=($i)
  [[ $i =~ [\"\']$ ]] && args[$lst]=${args[$lst]:1:-1}
done

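Modify the delimiter

In this solution we turn the whitespace into a delimiter character (a newline here, because an unquoted \l is just the letter l and would split words that contain it), remove the quotes, restore the spaces inside the multi-word arguments, and then split on the delimiter to get the correct arguments.

#solution4
d=$'\n'                          # delimiter that cannot occur inside a single input line
s=${*//[[:space:]]/$d}           # every whitespace character becomes the delimiter
while [[ $s =~ [\"\']([^\"\']*)[\"\'] ]]; do
  # drop the quotes and turn the delimiters between them back into spaces
  s=${s/"$BASH_REMATCH"/${BASH_REMATCH[1]//$d/ }}
done
mapfile -t args <<< "$s"         # split on the delimiter without changing IFS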

NEEDS WORK!!

Modify in place

Let bash convert the string to an array and then loop through it to fix it up.

args=($@) cnt=${#args[@]} idx=-1 chr=
for (( i=0; i<cnt; i++ )); do
  [[ $idx -lt 0 ]] && {
    [[ ${args[$i]:0:1} =~ [\'\"] ]] && \
       idx=$i chr=${args[$idx]:0:1} args[$idx]="${args[$idx]:1}"
    continue
  }
  args[$idx]+=" ${args[$i]}"
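  unset "args[$i]"      # quoted so the subscript is never treated as a glob
  [[ ${args[$idx]: -1:1} == $chr ]] && args[$idx]=${args[$idx]:0:-1} idx=-1
done
args=("${args[@]}")     # re-index, because unset leaves gaps in the array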
