I hope you can help me out with my problem.
I have an input file with 3 columns of data which looks like this:
Apl_No Act_No Sfx_No
100 10 0
100 11 1
100 12 2
100 13 3
101 20 0
101 21 1
I need to create an output file that contains the input data plus 3 additional fields. It should look like this:
Apl_No Act_No Sfx_No Crt_Act_No Prs_Act_No Cd_Act_No
100 10 0 - - -
100 11 1 10 11 12
100 12 2 11 12 13
100 13 3 12 13 10
101 20 0 - - -
101 21 1 20 21 20
Every Apl_No has a set of Act_No values mapped to it. Three new fields need to be created: Crt_Act_No, Prs_Act_No and Cd_Act_No. When the first row of a new Apl_No is encountered, columns 4, 5 and 6 (Crt_Act_No, Prs_Act_No, Cd_Act_No) need to be dashed out. For every following row with the same Apl_No, Crt_Act_No is the Act_No from the previous line, Prs_Act_No is the Act_No from the current line, and Cd_Act_No is the Act_No from the next line. This continues for all following rows bearing the same Apl_No except the last one. In the last row, Crt_Act_No and Prs_Act_No are filled in the same way as in the rows above, but Cd_Act_No needs to be taken from the Act_No of the first row of that Apl_No.
I would like to achieve this using awk. Can anyone please help me with how to go about this?
One solution:
awk '
    ## Print the header on the first line.
    FNR == 1 {
        printf "%s %s %s %s\n", $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No";
        next;
    }

    ## If the first field is not in the hash, this is the first row of a new
    ## "Apl_No", so print the line with dashes and save some data for later use.
    ## "line" holds the pending row from the previous iteration. Print it if set.
    ! apl[ $1 ] {
        if ( line ) {
            ## Anchor on the trailing placeholder dash so that a dash inside
            ## the data is never replaced by mistake.
            sub( /-$/, orig_act, line );
            print line;
            line = "";
        }
        printf "%s %s %s %s\n", $0, "-", "-", "-";
        orig_act = prev_act = $2;
        apl[ $1 ] = 1;
        next;
    }

    ## For all later rows of an already seen "Apl_No"...
    {
        ## If it is the first one after the line with dashes ("line" not set),
        ## save its content in "line" together with the value to check later
        ## ("Act_No"). Note that a dash is left in the last field, to be
        ## substituted in the following iteration.
        if ( ! line ) {
            line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
            prev_act = $2;
            next;
        }

        ## Now the last field is known, so substitute the dash with it, print,
        ## and repeat the process with the current line.
        sub( /-$/, $2, line );
        print line;
        line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
        prev_act = $2;
    }

    ## Flush the last pending row; it wraps around to the first "Act_No".
    END {
        if ( line ) {
            sub( /-$/, orig_act, line );
            print line;
        }
    }
' infile | column -t
That yields:
Apl_No  Act_No  Sfx_No  Crt_Act_No  Prs_Act_No  Cd_Act_No
100     10      0       -           -           -
100     11      1       10          11          12
100     12      2       11          12          13
100     13      3       12          13          10
101     20      0       -           -           -
101     21      1       20          21          20
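An alternative sketch, not part of the original answer: instead of patching a placeholder dash, buffer the rows of each Apl_No and print the whole group once the next Apl_No (or the end of input) is reached. It assumes that rows sharing an Apl_No are contiguous in the input. The names add_act_cols and emit_group are made up for illustration.

```shell
# Hypothetical wrapper: reads the 3-column file given as argument (or stdin).
add_act_cols() {
    awk '
    # Print the buffered group: dashes for its first row, then
    # previous/current/next Act_No, wrapping to the first row at the end.
    function emit_group(   i, nxt) {
        for (i = 1; i <= n; i++) {
            if (i == 1) {
                print row[i], "-", "-", "-"
                continue
            }
            nxt = (i < n) ? act[i + 1] : act[1]
            print row[i], act[i - 1], act[i], nxt
        }
        n = 0
    }

    # Header line: append the three new column names.
    FNR == 1 { print $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No"; next }

    # A change in Apl_No closes the previous group (assumes contiguous groups).
    $1 != key { emit_group(); key = $1 }

    # Buffer the current row and its Act_No.
    { n++; row[n] = $0; act[n] = $2 }

    # Emit the last group.
    END { emit_group() }
    ' "$@"
}

# Usage with the sample data from the question:
printf '%s\n' 'Apl_No Act_No Sfx_No' \
    '100 10 0' '100 11 1' '100 12 2' '100 13 3' \
    '101 20 0' '101 21 1' | add_act_cols
```

For the sample input this should print the same rows as above, separated by single spaces; pipe it through column -t to align the columns. The trade-off is that a whole group is held in memory at a time, which seems acceptable unless a single Apl_No has a very large number of rows.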