
Strange behaviour while subtracting 2 string arrays

I am subtracting array1 from array2. My two arrays are:

array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)

And the way I'm subtracting them is:

for i in "${array2[@]}"; do
         array1=(${array1[@]//$i})
done

echo ${array1[@]}

Now my expected result should be

dev-monitoring-busk test-ci-cd

But my actual result is

dev--busk test-ci-cd

The subtraction mostly looks right, but it also deletes the string monitoring from dev-monitoring-busk. I don't understand why. Can someone point out what's wrong here?

I know that there are other solutions out there for a diff between 2 arrays like

echo ${Array1[@]} ${Array2[@]} | tr ' ' '\n' | sort | uniq -u

But this is more of a diff and not a subtraction. So this does not work for me.

Bit of a kludge but it works...

  • use comm to find those items unique to a (sorted) data set
  • use tr to convert between spaces (' ' == array element separator) and newlines ('\n'; comm works on individual lines)
  • echo "${array1[@]}" | tr ' ' '\n' | sort : convert an array's elements into separate lines and sort
  • comm -23 (sorted data set #1) (sorted data set #2) : compare sorted data sets and return the rows that only exist in data set #1

Pulling this all together gives us:

$ array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
$ array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)

# find rows that only exist in array1

$ comm -23 <(echo "${array1[@]}" | tr ' ' '\n' | sort) <(echo "${array2[@]}" | tr ' ' '\n' | sort)
dev-monitoring-busk
test-ci-cd

# same thing but this time replace newlines with spaces (i.e., pull all items onto a single line of output):

$ comm -23 <(echo "${array1[@]}" | tr ' ' '\n' | sort) <(echo "${array2[@]}" | tr ' ' '\n' | sort) | tr '\n' ' '
dev-monitoring-busk test-ci-cd
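
If the result is needed back in an array rather than printed, the same pipeline can feed mapfile. This is a minimal sketch, assuming bash 4.0 or later (for mapfile); the name result is mine, not from the question. It also swaps echo | tr for printf '%s\n', so elements containing spaces would stay intact (elements containing newlines would still break it).

```shell
array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)

# printf '%s\n' prints one element per line;
# comm -23 keeps only the lines unique to the first (sorted) input
mapfile -t result < <(comm -23 <(printf '%s\n' "${array1[@]}" | sort) \
                               <(printf '%s\n' "${array2[@]}" | sort))

# one element per line: dev-monitoring-busk, then test-ci-cd
printf '%s\n' "${result[@]}"
```

Note that the result comes back in sorted order, not in the original order of array1.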

Notes about comm:

- takes 2 sorted data sets as input
- generates 3 columns of output:
    - (output column #1) rows only in data set #1
    - (output column #2) rows only in data set #2
    - (output column #3) rows in both data sets #1 and #2
- `comm -xy` ==> discard output columns 'x' and 'y'
    - `comm -12` => discard output columns #1 and #2 => only show lines common to both data sets (output column #3)
    - `comm -23` => discard output columns #2 and #3 => only show lines that exist only in data set #1 (output column #1)
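
To make the three output columns concrete, here is a small sketch using two made-up, already sorted inputs (a b and b c). The variable names are only for illustration.

```shell
# data set #1 is the lines "a" and "b"; data set #2 is the lines "b" and "c"
only_first=$(comm -23 <(printf '%s\n' a b) <(printf '%s\n' b c))   # column #1 only
only_second=$(comm -13 <(printf '%s\n' a b) <(printf '%s\n' b c))  # column #2 only
in_both=$(comm -12 <(printf '%s\n' a b) <(printf '%s\n' b c))      # column #3 only

echo "only in #1: $only_first"    # only in #1: a
echo "only in #2: $only_second"   # only in #2: c
echo "in both:    $in_both"       # in both:    b
```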

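If I'm understanding correctly, what you want is not to subtract array1 from array2, but to subtract array2 from array1. The ${array1[@]//$i} expansion does not remove whole elements: it is applied to each element in turn and deletes every substring that matches $i, which is why monitoring disappears from dev-monitoring-busk. Instead you can make use of an associative array, which is available in bash 4.0 and later.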
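
To see what the loop in the question actually does, here is a small sketch that runs the substitution by hand. The names stripped, emptied and dropped are only for illustration.

```shell
# Same array1 as in the question
array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)

# ${array1[@]//monitoring} deletes every occurrence of "monitoring"
# inside each element; it does not compare whole elements.
stripped=("${array1[@]//monitoring}")
printf '%s\n' "${stripped[4]}"    # prints dev--busk

# An element that matches completely becomes an empty string...
emptied=("${array1[@]//apps}")
echo "${#emptied[@]}"             # prints 6: the empty element is still there

# ...and only vanishes because the unquoted expansion is word-split.
dropped=(${array1[@]//apps})
echo "${#dropped[@]}"             # prints 5
```

So the loop appears to work for exact matches only as a side effect of word splitting, and it damages any element that merely contains one of the strings being removed.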

Please try the following:

declare -a array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
declare -a array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)

declare -A mark
declare -a ans

for e in "${array2[@]}"; do
    mark[$e]=1
done

for e in "${array1[@]}"; do
    [[ ${mark[$e]} ]] || ans+=( "$e" )
done

echo "${ans[@]}"

  • It first iterates over array2 and marks each of its elements in an associative array, mark.
  • It then iterates over array1 and adds an element to the answer only if it has not been marked.
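
If the subtraction is needed in more than one place, the two loops can be wrapped in a function. The following is only a sketch: subtract_arrays is a hypothetical helper name, and it assumes bash 4.3 or later because it passes the arrays by name using namerefs (local -n). The sample data is altered from the question to include an element with a space.

```shell
# Hypothetical helper: subtract_arrays OUT A B
# Stores in array OUT the elements of array A that are not in array B.
# All three arguments are array *names* (namerefs need bash 4.3+).
subtract_arrays() {
    local -n _out=$1 _a=$2 _b=$3
    local -A _seen=()
    local e
    _out=()
    for e in "${_b[@]}"; do
        _seen[$e]=1                      # mark every element of B
    done
    for e in "${_a[@]}"; do
        # ${_seen[$e]+x} expands to "x" only if the key exists
        [[ -n ${_seen[$e]+x} ]] || _out+=("$e")
    done
}

array1=(apps "dev monitoring" dev-monitoring-busk test-ci-cd)
array2=(apps monitoring)
remaining=()
subtract_arrays remaining array1 array2
printf '<%s>\n' "${remaining[@]}"
# <dev monitoring>
# <dev-monitoring-busk>
# <test-ci-cd>
```

Unlike the comm approach, this keeps the original order of array1 and compares whole elements, so neither "dev monitoring" nor dev-monitoring-busk is affected by monitoring being in array2.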
