
How to sum up matrices in multiple files using bash or awk

If I have an arbitrary number of files, say n files, and each file contains a matrix, how can I use bash or awk to add the matrices together element by element and write the result to an output file?

For example, if n=3, and I have these 3 files with the following contents:

$ cat mat1.txt

1 2 3
4 5 6
7 8 9

$ cat mat2.txt

1 1 1
1 1 1
1 1 1

$ cat mat3.txt

2 2 2
2 2 2
2 2 2

I want to get this output:

$ cat output.txt

4 5 6
7 8 9
10 11 12

Is there a simple one liner to do this?

Thanks!

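You can use awk with paste (here n is the number of columns per matrix, and the three terms in the sum correspond to the three files):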

awk -v n=3 '{for (i=1; i<=n; i++) printf "%s%s", ($i + $(i+n) + $(i+n*2)), 
            (i==n)?ORS:OFS}' <(paste mat{1,2,3}.txt)
4 5 6
7 8 9
10 11 12
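
The command above hard-codes three terms, one per file. As a minimal sketch of how the same paste idea might extend to an arbitrary number of files, the inner loop below walks every n-th field instead. The helper name sum_pasted and the scratch directory are assumptions made for illustration only; they are not part of the original answer.

```shell
# Work in a scratch directory and recreate the three example matrices.
tmpdir=$(mktemp -d)
printf '1 2 3\n4 5 6\n7 8 9\n' > "$tmpdir/mat1.txt"
printf '1 1 1\n1 1 1\n1 1 1\n' > "$tmpdir/mat2.txt"
printf '2 2 2\n2 2 2\n2 2 2\n' > "$tmpdir/mat3.txt"

# sum_pasted is a hypothetical helper name, not from the original answer.
# $1 = number of columns per matrix; remaining arguments = matrix files.
sum_pasted() {
    cols=$1
    shift
    paste "$@" | awk -v n="$cols" '{
        for (i = 1; i <= n; i++) {
            s = 0
            # Column i of each file sits at fields i, i+n, i+2n, ...
            for (k = i; k <= NF; k += n) s += $k
            printf "%s%s", s, (i == n ? ORS : OFS)
        }
    }'
}

result=$(sum_pasted 3 "$tmpdir"/mat*.txt)
printf '%s\n' "$result"
rm -r "$tmpdir"
```

The column count still has to be passed in, as in the original command, but the number of files no longer appears in the awk program.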

$ awk '{for (i=1;i<=NF;i++) total[FNR","i]+=$i;} END{for (j=1;j<=FNR;j++) {for (i=1;i<=NF;i++) printf "%3i ",total[j","i]; print "";}}' mat1.txt mat2.txt mat3.txt
  4   5   6 
  7   8   9 
 10  11  12 

This automatically adjusts to matrices of any size, provided all the files have the same dimensions. I don't believe that I have used any GNU features, so this should be portable to OS X and elsewhere.

How it works:

This command reads every line of every matrix, one matrix (file) at a time.

  • For each line read, the following command is executed:

     for (i=1;i<=NF;i++) total[FNR","i]+=$i 

    This loops over every column on the line and adds its value to the matching entry in the array total.

    GNU awk has multidimensional arrays but, for portability, they are not used here. awk's arrays are associative, so this builds an index from the line number within the current file, FNR, and the column number, i, by joining them with a comma. The result should be portable.

  • After all the matrices have been read, the results in total are printed:

     END{for (j=1;j<=FNR;j++) {for (i=1;i<=NF;i++) printf "%3i ",total[j","i]; print ""}} 

    Here, j loops over each line up to FNR, which in the END block still holds the number of lines in the last file read. Then i loops over each column up to NF, the number of columns on the last line read. For each row and column, the total is printed via printf "%3i ",total[j","i]. This prints the total as a 3-character-wide integer. If your numbers are floating-point or need more than three digits, adjust the format accordingly.

    At the end of each row, the print "" statement causes a newline character to be printed.
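
As a rough sketch of the steps above, here is a multi-line variant written out as a script. It is not the answer's exact command: it uses awk's built-in comma subscript (total[FNR, i], which joins the parts with SUBSEP) instead of a literal "," string, and it tracks the row and column counts in its own variables so that it does not depend on FNR and NF keeping their last values inside END. The names rows, cols, totals and tmpdir are assumptions chosen for illustration.

```shell
# Recreate the three example matrices in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 2 3\n4 5 6\n7 8 9\n' > "$tmpdir/mat1.txt"
printf '1 1 1\n1 1 1\n1 1 1\n' > "$tmpdir/mat2.txt"
printf '2 2 2\n2 2 2\n2 2 2\n' > "$tmpdir/mat3.txt"

totals=$(awk '
    {
        # Accumulate each cell under a (row, column) key.
        for (i = 1; i <= NF; i++) total[FNR, i] += $i
        # Remember the largest row and column counts seen (illustrative names).
        if (FNR > rows) rows = FNR
        if (NF > cols) cols = NF
    }
    END {
        for (j = 1; j <= rows; j++)
            for (i = 1; i <= cols; i++)
                # "+ 0" makes a missing cell print as 0 instead of an empty string.
                printf "%s%s", total[j, i] + 0, (i == cols ? ORS : OFS)
    }
' "$tmpdir"/mat*.txt)
printf '%s\n' "$totals"
rm -r "$tmpdir"
```

Printing with OFS and ORS gives single-space-separated output with no trailing blank; swap in a fixed-width format such as "%3i " if aligned columns matter more.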

GNU awk (version 4.0 and later) has true multidimensional arrays, so the same approach can be written as:

gawk '
    {
        for (i=1; i<=NF; i++) 
            m[i][FNR] += $i
    } 
    END {
        for (y=1; y<=FNR; y++) {
            for (x=1; x<=NF; x++)
                printf "%d ", m[x][y]
            print ""
        }
    }
' mat{1,2,3}.txt
