Reading a file mixing quoted and unquoted content into a bash array, retaining quotes

My file looks like this:

"dog" 23 "a description of the dog" 123 456 "21"
"cat"  5 "a description of the cat" 987 654 "22"

I'm loading the file line by line into an array:

filename=$1

while read -r line
do
   animal_array=($line)
   # do stuff
done < "$filename"

What I want to see:

animal_array[1] --> "dog"
animal_array[2] --> 23
animal_array[3] --> "a description of the dog"
animal_array[4] --> 123
animal_array[5] --> 456
animal_array[6] --> "21"

What I get:

animal_array[1] --> "dog"
animal_array[2] --> 23
animal_array[3] --> "a
animal_array[4] --> description
animal_array[5] --> of
animal_array[6] --> the
animal_array[7] --> dog"
animal_array[8] --> 123
animal_array[9] --> 456
animal_array[10] --> "21"

I'm struggling to find a way to check for quotes before I read the line into the array. The quotes need to be retained in the array.
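For context, the splitting shown in the question comes from the unquoted expansion in animal_array=($line): the shell splits the expanded value on whitespace, and quote characters that arrive as data have no syntactic meaning. A minimal sketch reproducing this on one sample line (variable names follow the question):

```shell
#!/usr/bin/env bash
line='"dog" 23 "a description of the dog" 123 456 "21"'

# Unquoted expansion: the value is split on IFS whitespace. The double quotes
# inside it are ordinary characters, so they do not group words together.
animal_array=($line)

echo "${#animal_array[@]} elements"
printf '[%s]\n' "${animal_array[@]}"
```

This reports ten elements rather than six, because the quoted description is broken into five separate words.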

If you don't mean to retain the quotes as data, use the answer at "Bash: Reading quoted/escaped arguments correctly from a string" instead.

That said, the GNU awk extension FPAT can be used for the kind of parsing you're requesting here, if you only need to handle double-quoted strings with literal data (no \" escaped quotes or other oddities within):

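split_quoted_strings() {
  gawk '
    BEGIN {
      # FPAT describes what a field looks like (rather than what separates fields):
      # either a run of non-space, non-quote characters or a double-quoted string.
      FPAT = "([^[:space:]\"]+)|(\"[^\"]+\")"
    }

    {
      # Emit the field count, then each field, all NUL-terminated.
      printf("%d\0", NF)
      for (i = 1; i <= NF; i++) {
        printf("%s\0", $i)
      }
    }
  ' "$@"
}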

# replace this with whatever you want to have called after a line has been read
handle_array() {
  echo "Read array with contents:"
  printf ' - %s\n' "$@"
  echo
}

# Reads from stdin; each record is a NUL-terminated field count followed by that many fields.
while IFS= read -r -d '' num_fields; do
  array=( )
  for ((i=0; i<num_fields; i++)); do
    IFS= read -r -d '' piece
    array+=( "$piece" )
  done
  handle_array "${array[@]}"
done < <(split_quoted_strings)

...which, given the sample file on stdin, properly emits as output:

Read array with contents:
 - "dog"
 - 23
 - "a description of the dog"
 - 123
 - 456
 - "21"

Read array with contents:
 - "cat"
 - 5
 - "a description of the cat"
 - 987
 - 654
 - "22"
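
If gawk is not available, a similar split can be sketched in pure bash with the [[ =~ ]] regex operator. This is an alternative technique, not the approach of the answer above; the function name split_line_keep_quotes and the global array fields are hypothetical names chosen for illustration. It makes the same assumption (no escaped quotes inside quoted strings). Unlike the FPAT pattern it also accepts an empty "" field, and it stops parsing at an unterminated quote.

```shell
#!/usr/bin/env bash

# Hypothetical helper: split one line into fields, keeping double-quoted runs
# together and retaining the quote characters. The result is stored in the
# global array "fields".
split_line_keep_quotes() {
  local line=$1
  # Skip leading whitespace, then capture one field (a double-quoted run or a
  # run of non-space, non-quote characters), then capture the rest of the line.
  local re='^[[:space:]]*("[^"]*"|[^[:space:]"]+)(.*)$'
  fields=()
  while [[ $line =~ $re ]]; do
    fields+=( "${BASH_REMATCH[1]}" )
    line=${BASH_REMATCH[2]}
  done
}

# Demo on one line from the sample file in the question.
split_line_keep_quotes '"dog" 23 "a description of the dog" 123 456 "21"'
printf ' - %s\n' "${fields[@]}"
```

To process a whole file, the function could be called on each line from a while IFS= read -r line loop, using fields in place of animal_array.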
