Subtract single largest number from multiple specific columns in awk

I have a comma-delimited file that looks like this:

R,F,TE,K,G,R
1,0,12,f,1,18
2,1,17,t, ,17
3,1,  , ,1,
4,0,15, ,0,16

Some items are missing, and the first row is a header that I want to ignore. I want to find the second-smallest number in specific columns and subtract it from every element in that column, unless the element is the column's minimum. In this example, I want to do this for columns 3 and 6. So my final values would be:

R,F,TE,K,G,R
1,0,12,f,1,1
2,1, 2,t, ,0
3,1, , ,1,
4,0, 0, ,0,16

I tried this on a single column, using a hand-coded threshold so that it picks the second-smallest value:

awk 'BEGIN {FS=OFS=","; 
};
{ min=1000000; 
 if($3<min && $3 != "" && $3>12) min = $3; 
 if($3>0) $3 = $3-min+1;
 print}
 END{print min}
 ' try1.txt

It finds the minimum all right, but the output is not as expected. There should be an easier way to do this in awk.

I'd loop over the file twice: once to find the minima, once to adjust the values. It's a trade-off of time versus memory: reading the file twice takes longer, but nothing has to be kept in memory.

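awk -F, -v OFS=, '
    FNR == 1   {if (NR != FNR) print; next}    # header: skip in pass 1, print in pass 2
    NR == FNR  {                               # pass 1: find the minima, ignoring blank fields
        if ($3 ~ /[0-9]/ && (min3 == "" || $3+0 < min3)) min3 = $3+0
        if ($6 ~ /[0-9]/ && (min6 == "" || $6+0 < min6)) min6 = $6+0
        next
    }
    $3 ~ /[0-9]/ && $3+0 != min3 {$3 -= min3}  # pass 2: shift every non-minimum value
    $6 ~ /[0-9]/ && $6+0 != min6 {$6 -= min6}
    {print}
' try1.txt try1.txt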
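
The two-pass approach relies on the file being named twice on the command line and on the `NR == FNR` test. As a minimal sketch of just that idiom (the file `demo.txt` and the counter `n` are made up for this illustration and are not part of the question):

```shell
# demo.txt is a throwaway file created only for this illustration.
printf 'a\nb\n' > demo.txt

# NR counts records across all inputs, FNR restarts at 1 for each file,
# so NR == FNR holds only while the first copy of the file is being read.
awk 'NR == FNR {n++; next} {print FNR "/" n ": " $0}' demo.txt demo.txt
```

The first pass only counts lines. The second pass already knows the total, so it prints `1/2: a` and `2/2: b`.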

For prettier output:

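awk -F, -v OFS=, '
    FNR == 1   {if (NR != FNR) print; next}    # header: skip in pass 1, print in pass 2
    NR == FNR  {                               # pass 1: find the minima, ignoring blank fields
        if ($3 ~ /[0-9]/ && (min3 == "" || $3+0 < min3)) min3 = $3+0
        if ($6 ~ /[0-9]/ && (min6 == "" || $6+0 < min6)) min6 = $6+0
        next
    }
    # pass 2: pad each shifted value to the width of the column minimum
    $3 ~ /[0-9]/ && $3+0 != min3 {$3 = sprintf("%*d", length("" min3), $3-min3)}
    $6 ~ /[0-9]/ && $6+0 != min6 {$6 = sprintf("%*d", length("" min6), $6-min6)}
    {print}
' try1.txt try1.txt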

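Given the new requirements (subtract the second-smallest value and leave the minimum untouched), first find the second-smallest value of each column with standard tools, then pass it to awk: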

# Second-smallest non-blank value of columns 3 and 6 (header skipped)
min2_3=$(cut -d, -f3 try1.txt | tail -n +2 | sort -n | grep -v '^ *$' | sed -n '2p')
min2_6=$(cut -d, -f6 try1.txt | tail -n +2 | sort -n | grep -v '^ *$' | sed -n '2p')

awk -F, -v OFS=, -v min2_3="$min2_3" -v min2_6="$min2_6" '
    NR==1 {print; next}                             # pass the header through
    $3 !~ /^ *$/ && $3 >= min2_3 {$3 -= min2_3}     # blanks and the minimum stay as they are
    $6 !~ /^ *$/ && $6 >= min2_6 {$6 -= min2_6}
    {print}
' try1.txt

Output:

R,F,TE,K,G,R
1,0,12,f,1,1
2,1,2,t, ,0
3,1,  , ,1,
4,0,0, ,0,16
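
Another approach, for GNU awk (asort() is a gawk extension), keeps every line in memory and does the subtraction in the END block:

BEGIN{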
    FS=OFS=","
}
{
    if(NR==1){print;next}
    if(+$3)a[NR]=$3
    if(+$6)b[NR]=$6
    s[NR]=$0
}
END{
    asort(a,c)
    asort(b,d)
    for(i=2;i<=NR;i++){
        split(s[i],t)
        if(t[3]!=c[1]&&+t[3]!=0)t[3]=t[3]-c[2]
        if(t[6]!=d[1]&&+t[6]!=0)t[6]=t[6]-d[2]
        print t[1],t[2],t[3],t[4],t[5],t[6]
    }
}
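
The second-smallest lookup can also be done inside awk itself, without `cut`/`sort` and without gawk's `asort`. The following is only a sketch under a few assumptions: "second smallest" is taken to mean the second-smallest distinct value, values at or above it are shifted (as in the `>=` test of the cut/sort answer), and the file name `second_min.awk`, the function `track` and the arrays `lo1`/`lo2` are hypothetical names chosen for illustration.

```shell
# Recreate the sample input from the question so the sketch is self-contained.
cat > try1.txt <<'EOF'
R,F,TE,K,G,R
1,0,12,f,1,18
2,1,17,t, ,17
3,1,  , ,1,
4,0,15, ,0,16
EOF

# second_min.awk is a hypothetical file name; track(), lo1 and lo2 are illustrative.
cat > second_min.awk <<'EOF'
# Keep the smallest (lo1) and second-smallest distinct (lo2) value per column c.
function track(c, v) {
    if (!(c in lo1))     { lo1[c] = v }
    else if (v < lo1[c]) { lo2[c] = lo1[c]; lo1[c] = v }
    else if (v > lo1[c] && (!(c in lo2) || v < lo2[c])) { lo2[c] = v }
}
# Header: skip it in the first pass, print it unchanged in the second.
FNR == 1  { if (NR != FNR) print; next }
# First pass: collect the minima of columns 3 and 6, ignoring blank fields.
NR == FNR {
    if ($3 ~ /[0-9]/) track(3, $3 + 0)
    if ($6 ~ /[0-9]/) track(6, $6 + 0)
    next
}
# Second pass: shift every value that is at least the second minimum.
{
    if ($3 ~ /[0-9]/ && (3 in lo2) && $3 + 0 >= lo2[3]) $3 -= lo2[3]
    if ($6 ~ /[0-9]/ && (6 in lo2) && $6 + 0 >= lo2[6]) $6 -= lo2[6]
    print
}
EOF

awk -F, -v OFS=, -f second_min.awk try1.txt try1.txt
```

On the sample data this prints the same five lines as the cut/sort version. If a column has fewer than two distinct values, `lo2` is never set for it and the column is left untouched.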
