繁体   English   中英

计算沿行和列连接的非零点的数量,但不计算 shell 脚本中矩阵中的对角线

[英]Count the number of connected non-zeros along rows and columns but not diagonaly in a Matrix in shell script

我有一个矩阵。 例如 10 x 15 矩阵

$ cat input.txt
2  3  4  5  10 0  2  2  0  1  0  0  0  1  0
0  3  4  6  2  0  2  0  0  0  0  1  2  3  40
0  0  0  2  3  0  3  0  3  1  2  3  1  0  0
1  2  0  4  0  3  4  0  4  1  2  0  0  1  1
0  0  0  0  0  0  10 3  4  12 4  5  12 3  10
26 3  4  5  10 0  2  2  4  1  0  0  0  1  0
0  30 4  6  2  0  2  0  0  0  0  1  2  3  40
0  0  0  2  3  0  0  0  0  1  0  3  1  0  0
1  2  10 4  0  0  4  0  4  1  2  0  0  1  1
0  0  0  0  0  0  0  3  0  0  4  5  0  3  10

我正在寻找所有连接的非零值及其最大值。 请看这张图。

在此处输入图像描述

所以愿望输出是:

outfile.txt
12 10    (12 is the yellow shaded connected non-zeros with maximum value 10)
42 40    (42 is the light-red shaded connected non-zeros with maximum value 40)
 1  1    (The light purple isolated non-zero)
 2  2    (The light green connected non-zeros)
15 30    and so on
 6  5
 1  4
 4 10 
 1  3

很难开发一个合适的算法来进一步用 fortran 或 shell 脚本编写它。 我正在考虑以下算法,但无法想到下一步是什么。

step 1: #Assign the entries with a[ij], i-row, j-column
        #Now make different non-zero connected cell arrays (e.g. c1[k],c2[k],c3[k],....etc) 
        for i in {1..10};do
          for j in {1..15};do
            if [ a(i,j) != 0 ];then c1[k]=a(i,j); a(i,j)="*" #(assign a(i,j) to c1[k] and 
                                                             #replace its original value to "*" 
                                                             #because it should not be considered further)
Step 2: #Now check the left-right-up-down elements of a(i,j), if non-zero
                if [ a(i,j-1) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i,j-1); a(i,j-1)="*"
                if [ a(i,j+1) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i,j+1); a(i,j+1)="*"
                if [ a(i-1,j) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i-1,j); a(i-1,j)="*"
                if [ a(i+1,j) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i+1,j); a(i+1,j)="*"

Step 3: #continue the same process for each non-zeros until a zero at all-end.

Step 4: #Count the number of elements in c1[k] and find it maximum

这是 GNU awk 中的一个:

awk '
function check(x,y,s,   v) {                        # recursively check neighbours
    v=m[x][y]                                       # store val
    delete m[x][y]                                  # del to avoid loops
    if(((x+1) in m) && (y in m[x+1]) && m[x+1][y])  # test if neighbour exists
        check(x+1,y,s)                              # and check it
    if((x in m) && ((y+1) in m[x]) && m[x][y+1])
        check(x,y+1,s)
    if(((x-1) in m) && (y in m[x-1]) && m[x-1][y])
        check(x-1,y,s)
    if((x in m) && ((y-1) in m[x]) && m[x][y-1])
        check(x,y-1,s)
    if(v>max[s])
        max[s]=v                                    # keep max
    set[s]++                                        # count set size
}
{
    for(x=1;x<=NF;x++)                              # hash values
        m[x][NR]=$x
}
END {
    for(x in m)                                     # in no particular order
        for(y in m[x])                              
            if(m[x][y])
                check(x,y,++s)                      # start checking
    for(i in set)                                   # output
        print set[i],max[i]
}' file

输出:

12 10
2 2
15 30
42 40
1 4
1 3
6 5
1 1
4 10

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM