
Reset row number count in awk

I have a file like this

file.txt

0   1   a
1   1   b
2   1   d
3   1   d
4   2   g
5   2   a
6   3   b
7   3   d
8   4   d
9   5   g
10   5   g
.
.
.

I want to reset the row number count in the first column ($1) to 0 whenever the value in the second column ($2) changes, using awk or a bash script.

result

0   1   a
1   1   b
2   1   d
3   1   d
0   2   g
1   2   a
0   3   b
1   3   d
0   4   d
0   5   g
1   5   g
.
.
. 

As long as you don't mind a bit of extra memory usage, and the second column is sorted, I think this is the most fun:

awk '{$1=a[$2]+++0;print}' input.txt
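For anyone puzzled by `a[$2]+++0`: awk reads it as `a[$2]++ + 0`. Here is the same idea spelled out as a rough sketch. The function name `renumber_by_key` and the array name `count` are mine, purely for illustration.

```shell
# Hypothetical helper; reads records on stdin.
renumber_by_key() {
    awk '{
        # One counter per distinct value of column 2.
        # Post-increment yields the old value, so the first row of a key gets 0.
        # The trailing "+ 0" only makes the numeric context explicit.
        $1 = count[$2]++ + 0
        print
    }'
}

printf '0 1 a\n1 1 b\n2 2 g\n3 2 a\n4 3 b\n' | renumber_by_key
```

Because the counters are kept per key, a key that reappears later in unsorted input continues from its earlier count instead of restarting at 0, which is why the sorted-column caveat matters.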

This awk one-liner seems to work for me:

[ghoti@pc ~]$ awk 'prev!=$2{first=0;prev=$2} {$1=first;first++} 1' input.txt
0 1 a
1 1 b
2 1 d
3 1 d
0 2 g
1 2 a
0 3 b
1 3 d
0 4 d
0 5 g
1 5 g

Let's break apart the script and see what it does.

  • prev!=$2 {first=0;prev=$2} -- This is what resets your counter. Since the initial state of prev is empty, we reset on the first line of input, which is fine.
  • {$1=first;first++} -- For every line, set the first field, then increment the variable we're using to set the first field.
  • 1 -- this is awk short-hand for "print the line". It's really a condition that always evaluates to "true", and when a condition/statement pair is missing a statement, the statement defaults to "print".

Pretty basic, really.
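The breakdown above can also be written as a multi-line script with comments. This is only a sketch: the wrapper name `renumber_on_change` is made up, and the `NR == 1` test is an extra guard that is not in the one-liner. I added it on the assumption that a first key such as 0 could compare equal to the still-unset `prev` and skip the initial reset.

```shell
# Hypothetical wrapper; reads records on stdin.
renumber_on_change() {
    awk '
        # New group: first record (added guard), or column 2 differs from the previous row.
        NR == 1 || prev != $2 { first = 0; prev = $2 }
        # Every row: overwrite column 1 with the counter, then advance it.
        { $1 = first; first++ }
        # A bare true pattern with no action prints the rebuilt record.
        1
    '
}

printf '0 1 a\n1 1 b\n2 2 g\n3 2 a\n4 3 b\n' | renumber_on_change
```

Unlike a per-key array, this only remembers the previous key, so a key that shows up again later starts from 0 again.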

The one catch of course is that when you change the value of any field in awk, it rewrites the line using the output field separator, which by default is a single space. If you want to adjust this, you can set your OFS variable:

[ghoti@pc ~]$ awk -vOFS="   " 'p!=$2{f=0;p=$2}{$1=f;f++}1' input.txt | head -2
0   1   a
1   1   b

Salt to taste.
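To see the rebuild behaviour in isolation, here is a small sketch. The helper name `rebuild_with_ofs` is hypothetical; `$1 = $1` is just a common way to force awk to rebuild the record without changing any field.

```shell
# Untouched record: the original spacing survives.
printf 'a   b   c\n' | awk '{ print }'

# Hypothetical helper: force a rebuild of the record using the OFS given as the first argument.
rebuild_with_ofs() {
    awk -v OFS="$1" '{ $1 = $1; print }'
}

printf 'a   b   c\n' | rebuild_with_ofs ' '    # fields rejoined with one space
printf 'a   b   c\n' | rebuild_with_ofs '\t'   # fields rejoined with a tab
```

The first command prints the line exactly as it came in; the other two print the same three fields joined by whatever OFS was passed.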

A pure bash solution:

file="/PATH/TO/YOUR/OWN/INPUT/FILE"

count=0
old_trigger=0

while read -r a b c; do
    if ((b == old_trigger)); then
        echo "$((count++)) $b $c"
    else
        count=0
        echo "$((count++)) $b $c"
        old_trigger=$b
    fi

done < "$file"

This solution (IMHO) has the advantage of using a readable algorithm. I like the other answers, but they are not that easy for beginners to follow.

NOTE :

((...)) is an arithmetic command, which returns an exit status of 0 if the expression is nonzero, or 1 if the expression is zero. It is also used as a synonym for let, if side effects (assignments) are needed. See http://mywiki.wooledge.org/ArithmeticExpression
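A quick way to check the exit-status rule yourself, as a bash-only sketch (the helper name `arith_status` is made up for illustration):

```shell
# Hypothetical helper: evaluate "$1" as an arithmetic command and print its exit status.
arith_status() {
    local status=0
    (( $1 )) || status=$?
    echo "$status"
}

arith_status '5'         # nonzero value
arith_status '0'         # zero value
arith_status '3 == 3'    # true comparison evaluates to 1
count=0
arith_status 'count++'   # post-increment yields the old value, 0
```

This prints 0, 1, 0 and 1 in that order. The last case is worth remembering: a bare `((count++))` reports failure when count was 0, which matters under `set -e`. The loop above sidesteps that by using the `$((count++))` expansion inside echo instead.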

Perl solution:

perl -naE '
    $dec  =  $F[0] if defined $old and $F[1] != $old;
    $F[0] -= $dec;
    $old  =  $F[1];
    say join "\t", @F[0,1,2];'

$dec is subtracted from the first column each time. When the second column changes (its previous value is stored in $old), $dec is set to the current value of the first column, so the subtraction brings that row back to zero. The defined condition is needed for the first line to work.
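For comparison, the same offset idea can be sketched in awk. This is my own port, not part of the Perl answer. The function name `renumber_by_offset` is hypothetical, and `NR > 1` stands in for the `defined $old` check.

```shell
# Hypothetical helper; reads records on stdin.
renumber_by_offset() {
    awk '
        # On a key change, remember the running number of the first row of the new group.
        NR > 1 && $2 != old { dec = $1 }
        # Subtract that offset from every row; dec counts as 0 until the first change.
        { $1 -= dec; old = $2; print }
    '
}

printf '0 1 a\n1 1 b\n2 2 g\n3 2 a\n4 3 b\n' | renumber_by_offset
```

Note that this approach relies on the first column already holding a continuous running count, as in the question's input; it adjusts the existing numbers rather than generating new ones.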
