Create a matrix out of a table using awk

Question

I want to use this table:

a   16  moe max us
b   11  tom mic us
d   14  roe fox au
t   29  ann teo au
n   28  joe joe ca

and make this matrix using awk (or any other simple option in bash):

    a_16;   b_11;   d_14;   t_29;   n_28
us; moe_max;    tom_mic;    ;   ;       
au; ;   ;   roe_fox;    ann_teo;    
ca; ;   ;   ;   ;   joe_joe

I tried this, but it didn't work:

awk '{a[$5]=a[$5]?a[$5] FS $1"_"$2:$1"_"$2; b[$5]=b[$5]?b[$5] FS $3"_"$4:$3"_"$4;} END{for (i in a){print i"\t" a[i] "\t" b[i];}}' fis.txt

Answer 1

Using any awk:

$ cat tst.awk
{
    row           = $NF
    col           = $1 "_" $2
    vals[row,col] = $3 "_" $4
}

!seenRow[row]++ { rows[++numRows] = row }
!seenCol[col]++ { cols[++numCols] = col }

END {
    OFS = ";  "

    printf "     "
    for ( colNr=1; colNr<=numCols; colNr++ ) {
        col = cols[colNr]
        printf "%s%s", col, (colNr<numCols ? OFS : ORS)
    }

    for ( rowNr=1; rowNr<=numRows; rowNr++ ) {
        row = rows[rowNr]
        printf "%s%s", row, OFS
        for ( colNr=1; colNr<=numCols; colNr++ ) {
            col = cols[colNr]
            #val = ((row,col) in vals ? vals[row,col] : "  ")
            val = vals[row,col]
            printf "%s%s", val, (colNr<numCols ? OFS : ORS)
        }
    }
}

$ awk -f tst.awk file
     a_16;  b_11;  d_14;  t_29;  n_28
us;  moe_max;  tom_mic;  ;  ;
au;  ;  ;  roe_fox;  ann_teo;
ca;  ;  ;  ;  ;  joe_joe

I can't see a pattern in the expected output in your question for when there should be 1, 2, 3, or 4 spaces after each ; so I just used a consistent 2 in the above. Massage it to suit.
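As one way of massaging it, here is a minimal sketch that takes the separator as an argument instead of hard-coding it in OFS. The function name make_matrix and the variable name sep are assumptions made for illustration; the logic is otherwise the same as tst.awk above.

```shell
# Recreate the sample input from the question (the name "file" is arbitrary).
cat > file <<'EOF'
a   16  moe max us
b   11  tom mic us
d   14  roe fox au
t   29  ann teo au
n   28  joe joe ca
EOF

# make_matrix is a hypothetical helper: $1 is the input file, $2 an optional
# separator (defaults to ";  ", the one used in tst.awk).
make_matrix() {
    awk -v sep="${2:-;  }" '
    {
        row           = $NF
        col           = $1 "_" $2
        vals[row,col] = $3 "_" $4
    }
    # Remember row and column labels in the order they first appear.
    !seenRow[row]++ { rows[++numRows] = row }
    !seenCol[col]++ { cols[++numCols] = col }
    END {
        # Header line: a blank first cell, then every column label.
        printf "     "
        for ( c=1; c<=numCols; c++ )
            printf "%s%s", cols[c], (c<numCols ? sep : ORS)

        # One line per row label; missing cells print as empty strings.
        for ( r=1; r<=numRows; r++ ) {
            printf "%s%s", rows[r], sep
            for ( c=1; c<=numCols; c++ )
                printf "%s%s", vals[rows[r],cols[c]], (c<numCols ? sep : ORS)
        }
    }' "$1"
}

# A ";" followed by a tab; awk expands the \t escape in -v assignments.
make_matrix file ';\t'
```

Calling make_matrix file with no second argument falls back to the same ";  " separator as tst.awk.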

Answer 2

Using gawk multidimensional arrays (this requires GNU awk) for collecting header columns and row indices:

awk '{
    head[NR] = $1"_"$2;
    idx[$5][NR] = $3"_"$4
}
END {
    h = ""; col_size = length(head);
    for (i = 1; i <= col_size; i++) {
        h = sprintf("%s  %s", h, head[i])
    }
    print h;
    for (lab in idx) {
        printf("%s", lab);
        for (i = 1; i <= col_size; i++) {
            v = sprintf("%s;  %s", v, idx[lab][i])
        }
        print v;
        v = "";
    }
}' test.txt

  a_16  b_11  d_14  t_29  n_28
ca;  ;  ;  ;  ;  joe_joe
au;  ;  ;  roe_fox;  ann_teo;  
us;  moe_max;  tom_mic;  ;  ;
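Note that for (lab in idx) visits the row labels in an unspecified order, which is why ca comes first in the output above, and that the header cells are not separated by ; there. The following is a minimal sketch, assuming GNU awk 4.0 or later (for arrays of arrays), that records the labels in first-seen order and adds ; between header cells. The names ordered_matrix, order and n are illustrative and not part of the answer.

```shell
# ordered_matrix is a hypothetical helper; $1 is the input file.
ordered_matrix() {
    gawk '
    {
        head[NR] = $1 "_" $2
        # Remember each row label the first time it is seen.
        if (!($5 in idx)) order[++n] = $5
        idx[$5][NR] = $3 "_" $4
    }
    END {
        h = ""; col_size = length(head)
        # Header: two leading spaces per cell, ";" between cells.
        for (i = 1; i <= col_size; i++)
            h = sprintf("%s%s  %s", h, (i > 1 ? ";" : ""), head[i])
        print h
        # Rows in input order rather than for-in order.
        for (r = 1; r <= n; r++) {
            lab = order[r]
            v = lab
            for (i = 1; i <= col_size; i++)
                v = sprintf("%s;  %s", v, idx[lab][i])
            print v
        }
    }' "$1"
}
```

It would be called as ordered_matrix test.txt, and prints the us row first, then au, then ca.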

Answer 3

Here is a Ruby script to do that:

ruby -e 'd=$<.read.
    split(/\R/).
    map(&:split).
    map{|sa| sa.each_slice(2).map{|ss| ss.join("_") } }.
    group_by{|sa| sa[-1] }

# {"us"=>[["a_16", "moe_max", "us"], ["b_11", "tom_mic", "us"]], "au"=>[["d_14", "roe_fox", "au"], ["t_29", "ann_teo", "au"]], "ca"=>[["n_28", "joe_joe", "ca"]]}

heads=d.values.flatten(1).map{|sa| sa[0]}
# ["a_16", "b_11", "d_14", "t_29", "n_28"]

hsh=Hash.new {|h,k| h[k] = ["\t"]*heads.length}
d.each{|k,v| 
    v.each{|sa| 
        hsh[k][heads.index(sa[0])]="\t#{sa[1]}"
    }
}
puts heads.map{|e| "\t#{e}" }.join(";")
hsh.each{|k,v| puts "#{k};\t#{v.join(";")}"}
' file

Prints:

    a_16;   b_11;   d_14;   t_29;   n_28
us;     moe_max;    tom_mic;    ;   ;   
au;     ;   ;   roe_fox;    ann_teo;    
ca;     ;   ;   ;   ;   joe_joe

3 answers

Answer 1  2  2023-01-28 13:20:25
Answer 2  1  2023-01-28 10:28:25
Answer 3  0  2023-01-28 18:25:06
