
How can I compare two 2D-array files with bash?

I have two files, each holding a 2D array, that I want to read with bash.

What I want to do is extract the elements that appear in both files.

The two files have different dimensions (rows x columns), for example:

file1.txt (nx7)

NO DESC ID TYPE W S GRADE
1 AAA 20 AD 100 100 E2 
2 BBB C0 U 200 200 D 
3 CCC 9G R 135 135 U1 
4 DDD 9H Z 246 246 T1 
5 EEE 9J R 789 789 U1 
.
.
.

file2.txt (mx3)

DESC W S 
AAA 100 100 
CCC 135 135
EEE 789 789
.
.
.

Here is what I want to do:

  1. Extract the element in the DESC column of file2.txt, then find the corresponding element in file1.txt.

  2. Extract the W and S elements in that row of file2.txt, then find the corresponding W and S elements in the matching row of file1.txt.

  3. If [W1==W2 && S1==S2]; then echo "${DESC[colindex]} ok"; else echo "${DESC[colindex]} NG"

How can I read this kind of file as a 2D array with bash, or is there a more convenient way to do this?

bash does not support 2D arrays. You can simulate them by generating 1D array variables such as array1, array2, and so on.

Assuming DESC is a key (i.e. it has no duplicate values) and does not contain any spaces:

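#!/bin/bash

# Read file1 into one array per row: data0, data1, ...
# -r keeps backslashes in the input literal.
idx=0
while read -r -a "data$idx"; do
    idx=$((idx + 1))
done <file1.txt

# For each row of file2, look for the file1 row with the same DESC
while read -r desc w2 s2; do
    for ((i=0; i<idx; i++)); do
        v="data$i[1]"            # DESC is element 1 (0-based) of each file1 row
        [ "$desc" = "${!v}" ] && {
            w1="data$i[4]"       # W is element 4
            s1="data$i[5]"       # S is element 5
            # Two separate tests instead of the obsolescent "-a" operator
            if [ "$w2" = "${!w1}" ] && [ "$s2" = "${!s1}" ]; then
                echo "$desc ok"
            else
                echo "$desc NG"
            fi
            break
        }
    done
done <file2.txt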

For brevity, optimizations such as taking advantage of sort order are left out.

If the files actually contain the header NO DESC ID TYPE ... then use tail -n +2 to discard it before processing.
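
The inner loop above rescans the rows of file1 for every row of file2. As a rough sketch of one alternative (not part of the original answer), bash 4+ associative arrays can be keyed by DESC so that each lookup is direct. The function name compare_by_desc and the variable names are made up for illustration, and the sketch assumes each file starts with exactly one header line:

```shell
#!/bin/bash
# Sketch only: compare_by_desc is a hypothetical helper, not from the answer.
# Needs bash 4+ (associative arrays). Assumes one header line per file.
compare_by_desc() {
    local file1=$1 file2=$2
    local -A w_of s_of            # DESC -> W and DESC -> S, taken from file1
    local no desc id type w s grade w2 s2

    # Load file1, skipping its header line.
    while read -r no desc id type w s grade; do
        [[ -n $desc ]] || continue            # ignore blank or short lines
        w_of[$desc]=$w
        s_of[$desc]=$s
    done < <(tail -n +2 "$file1")

    # Look up each file2 row by DESC instead of scanning file1 again.
    while read -r desc w2 s2; do
        [[ -n $desc ]] || continue
        [[ -n ${w_of[$desc]+x} ]] || continue # DESC not present in file1
        if [[ $w2 == "${w_of[$desc]}" && $s2 == "${s_of[$desc]}" ]]; then
            echo "$desc ok"
        else
            echo "$desc NG"
        fi
    done < <(tail -n +2 "$file2")
}
```

It would be called as compare_by_desc file1.txt file2.txt. Rows of file2 whose DESC is missing from file1 are skipped silently, which matches the behaviour of the loop version.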

A more elegant solution is also possible, one that avoids reading the entire file into memory. This should only matter for really large files, though.
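
The answer does not show that solution. One possible reading, offered only as a sketch, is to keep just the three columns of file2 in an awk array and stream file1 past it row by row. stream_compare is a hypothetical name; the sketch assumes one header line per file, whitespace-separated columns, and a non-empty file2 (the NR == FNR idiom misfires when the first file is empty):

```shell
#!/bin/bash
# Sketch only: stream_compare is a hypothetical helper, not from the answer.
# Usage: stream_compare file1.txt file2.txt
stream_compare() {
    awk '
        FNR == 1 { next }            # skip the header line of each file
        NR == FNR {                  # first argument (file2): remember W and S per DESC
            w[$1] = $2
            s[$1] = $3
            next
        }
        $2 in w {                    # second argument (file1): handled one row at a time
            # Appending "" forces a string comparison, so 000 and 0 stay different.
            same = (($5 "") == (w[$2] "")) && (($6 "") == (s[$2] ""))
            print $2, (same ? "ok" : "NG")
        }
    ' "$2" "$1"
}
```

Only file2 is held in memory here, so this helps when file1 is the large one. Results come out in file1 order rather than file2 order.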

If the row order does not need to be preserved (i.e. the rows can be sorted), this may be enough:

join -2 2 -o 1.1,1.2,1.3,2.5,2.6 <(tail -n +2 file2.txt | sort -k1,1) <(tail -n +2 file1.txt | sort -k2,2) |
    sed 's/^\([^ ]*\) \([^ ]*\) \([^ ]*\) \2 \3$/\1 OK/' |
    sed '/ OK$/!s/\([^ ]*\) .*/\1 NG/'

For file1.txt

NO DESC ID TYPE W S GRADE
1 AAA 20 AD 100 100 E2 
2 BBB C0 U 200 200 D 
3 CCC 9G R 135 135 U1 
4 DDD 9H Z 246 246 T1 
5 EEE 9J R 789 789 U1 

and file2.txt

DESC W S 
AAA 000 100 
CCC 135 135
EEE 789 000
FCK xxx 135

produces:

AAA NG
CCC OK
EEE NG

Explanation:

  • skip the header line in both files with tail -n +2
  • sort both files; join expects its inputs sorted on the join field, here DESC
  • join the needed columns from both files into one table; only the lines whose DESC appears in both files show up in the result, like this:

AAA 000 100 100 100
CCC 135 135 135 135
EEE 789 000 789 789

  • in the lines where columns 2 and 4 match and columns 3 and 5 match, replace everything after the first column with OK
  • in the remaining lines, replace everything after the first column with NG
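
The two sed passes in the steps above can also be collapsed into a single field comparison. A minimal sketch, assuming one header line per file and single-space-separated columns; join_compare is a hypothetical name, and awk is swapped in for sed here:

```shell
#!/bin/bash
# Sketch only: join_compare is a hypothetical helper, not from the answer.
# Usage: join_compare file1.txt file2.txt
join_compare() {
    local file1=$1 file2=$2
    # join needs both inputs sorted on the join field:
    # field 1 (DESC) of file2 and field 2 (DESC) of file1.
    join -2 2 -o 1.1,1.2,1.3,2.5,2.6 \
        <(tail -n +2 "$file2" | sort -k1,1) \
        <(tail -n +2 "$file1" | sort -k2,2) |
    # Columns are now: DESC W2 S2 W1 S1. Appending "" forces string comparison.
    awk '{ print $1, (((($2 "") == ($4 "")) && (($3 "") == ($5 ""))) ? "OK" : "NG") }'
}
```

With the sample file1.txt and file2.txt above, this should give the same three lines as the sed version, and the FCK row is dropped because join only keeps DESC values present in both files.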
