
Replace characters with awk

I have the following file:

61 12451
61 13451
61 14451
61 15415
12 48469
12 78456
12 47845
32 45778
32 48745
32 47845
32 52448
32 87451

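The output I want is shown below. Each distinct value in the first column is replaced by its order of first appearance: 61 becomes 1, 12 becomes 2 and 32 becomes 3. Because these are pairwise comparisons, a group is never compared with itself (1 to 1 is ignored), so the second column starts at the group number plus one and increases by one per line. For example, 61 is repeated 4 times, so its second column runs from 2 to 5, and so on for the rest.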

1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
3 6
3 7
3 8

Any suggestion on how to achieve this with AWK? Thanks!

It can be written as a single awk command like this:

awk '{a[NR]=$1;b[NR]=$2;c[NR]=$1;d[NR]=$2} END {for(i=1; i<=NR; i++){if(i==1){c[i]=1;d[i]=2}else if(a[i]==a[i-1]){c[i]=c[i-1];d[i]=1+d[i-1]}else{c[i]=1+c[i-1];d[i]=c[i]+1}print c[i],d[i]}}' pairwise.txt > output.txt

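Here a and b are arrays that hold the first and second columns of the input (b is stored but never used). The new values are built in arrays c and d, which become the first and second output columns; they are printed in the END block and the shell redirection writes them to output.txt.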
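For readability, the same array-based logic can be laid out over several lines. This is only a sketch of the answer above, not a different method: the array name key is my own renaming of a, the unused b array is dropped, and the sample data is the input from the question saved under the assumed file name pairwise.txt.

```shell
# Sample input from the question (file name assumed to be pairwise.txt).
cat > pairwise.txt <<'EOF'
61 12451
61 13451
61 14451
61 15415
12 48469
12 78456
12 47845
32 45778
32 48745
32 47845
32 52448
32 87451
EOF

awk '
{ key[NR] = $1 }                         # remember column 1 of every line
END {
    for (i = 1; i <= NR; i++) {
        if (i == 1) {                    # first line: group 1, partner 2
            c[i] = 1; d[i] = 2
        } else if (key[i] == key[i-1]) { # same group: keep c, advance d
            c[i] = c[i-1]; d[i] = d[i-1] + 1
        } else {                         # new group: next c, d restarts at c + 1
            c[i] = c[i-1] + 1; d[i] = c[i] + 1
        }
        print c[i], d[i]
    }
}' pairwise.txt > output.txt
```

Because every line is buffered until END, memory use grows with the size of the file; only the previous line is ever consulted, so the arrays are not strictly needed.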

Not sure if this one-liner helps:

awk '$1!=p{++i;j=i+1}{print i,j++;p=$1}' file

At least it gives the desired output.
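To make the one-liner easier to follow, here is a commented sketch with longer variable names: prev, group and partner are my renamings of p, i and j, and renumber is a hypothetical shell wrapper. The NR == 1 test is my own addition and not part of the original answer; it is meant to guard the case where the first key compares equal to an unset variable (for example a key of 0).

```shell
# Hypothetical wrapper; reads the named files, or standard input if none are given.
renumber() {
    awk '
    # A new key starts the next group; its partner restarts at group + 1.
    NR == 1 || $1 != prev { group++; partner = group + 1 }
    # Print the pair, then advance the partner and remember the key.
    { print group, partner++; prev = $1 }
    ' "$@"
}

renumber <<'EOF'
61 12451
61 13451
61 14451
61 15415
12 48469
12 78456
12 47845
32 45778
32 48745
32 47845
32 52448
32 87451
EOF
```

Like the array version, this assumes equal keys are contiguous: a key that reappears after a different key would be counted as a new group. It processes the input as a stream and keeps only three variables.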
