I have data in this format (many lines like this):
TASK : Task 1
TASK : Task 2
TASK : Task 3
OWNER : Emp 1
OWNER : Emp 2
OWNER : Emp 3
Deadline : Monday
Deadline : Tuesday
Deadline : Wednesday
I want to convert this to:
TASK    OWNER  Deadline
Task 1  Emp 1  Monday
Task 2  Emp 2  Tuesday
Task 3  Emp 3  Wednesday
Even if I can just extract each column without the column header names, that would be good. I can add the column names manually afterwards.
Is there a way to do this using awk or sed?
One way with awk:
awk -F': *' '{i=NR%3;i=i?i:3;a[i]=a[i]?a[i]"\t"$2:$2}
END{for(x=1;x<=length(a);x++)print a[x]}' file
It keeps the input order and omits the header line (the 3 is the number of lines per label in the sample):
kent$ cat f
TASK : Task 1
TASK : Task 2
TASK : Task 3
OWNER : Emp 1
OWNER : Emp 2
OWNER : Emp 3
Deadline : Monday
Deadline : Tuesday
Deadline : Wednesday
kent$ awk -F': *' '{i=NR%3;i=i?i:3;a[i]=a[i]?a[i]"\t"$2:$2}END{for(x=1;x<=length(a);x++)print a[x]}' f
Task 1 Emp 1 Monday
Task 2 Emp 2 Tuesday
Task 3 Emp 3 Wednesday
awk -F': *'                 # FS is a colon followed by any number of spaces
'{i=NR%3;i=i?i:3;           # i = NR%3; if i is 0, set i to 3, because we want
                            # the i=0 lines to end up in the last output row
a[i]=a[i]?a[i]"\t"$2:$2}    # append the 2nd field to a[i], separated by a tab
END{for and print}' file    # at the end, loop over the array and print each row
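The 3 in NR%3 only fits the sample, where every label occurs exactly three times. As a rough sketch, the same idea can take that count as a parameter. The variable name n and the temporary file are my own choices for illustration, and the script assumes every label occurs exactly n times:

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
TASK : Task 1
TASK : Task 2
TASK : Task 3
OWNER : Emp 1
OWNER : Emp 2
OWNER : Emp 3
Deadline : Monday
Deadline : Tuesday
Deadline : Wednesday
EOF

# n is the number of lines per label (an illustrative name, 3 for the sample).
transposed=$(awk -F': *' -v n=3 '
    { i = NR % n; if (i == 0) i = n              # row index in 1..n
      a[i] = (a[i] == "" ? "" : a[i] "\t") $2 }  # append the value to row i
    END { for (x = 1; x <= n; x++) print a[x] }  # print the rows in input order
' "$input")
printf '%s\n' "$transposed"
```

Looping up to n instead of length(a) also avoids relying on length() accepting an array, which not every awk supports.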
We can save the header in a variable h and print it before looping over the array a:
awk -F': *' '{...h=i==1?(h?h"\t"$1:$1):h;a[i]=..}
END{print h;for...}' file
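The command above is abbreviated. One possible way to spell it out in full is sketched below; this is my reading of it, not necessarily what the author ran. The field separator is changed to ' *: *' so the label loses its trailing space (an assumption on my part), and the temporary file name is only for illustration:

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
TASK : Task 1
TASK : Task 2
TASK : Task 3
OWNER : Emp 1
OWNER : Emp 2
OWNER : Emp 3
Deadline : Monday
Deadline : Tuesday
Deadline : Wednesday
EOF

with_header=$(awk -F' *: *' '
    { i = NR % 3; i = i ? i : 3 }                 # same row index as before
    i == 1 { h = (h == "" ? "" : h "\t") $1 }     # first line of each label: collect its name
    { a[i] = (a[i] == "" ? "" : a[i] "\t") $2 }   # append the value to row i
    END { print h; for (x = 1; x <= 3; x++) print a[x] }
' "$input")
printf '%s\n' "$with_header"
```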
Here's a relatively nice awk version:
BEGIN {FS=" : "; OFS="\t"}
/^TASK/     {task[tpos++]  = $2}
/^OWNER/    {owner[opos++] = $2}
/^Deadline/ {due[dpos++]   = $2}
END {
    print "TASK", "OWNER", "DEADLINE"
    # a counted loop keeps the input order; for (i in task) has no guaranteed order
    for (i = 0; i < tpos; i++) {
        print task[i], owner[i], due[i]
    }
}
It saves a line per block because it does not need the gsub() call, as it uses " : " as the delimiter. Store it in, let's say, test.awk and execute it as follows:
awk -f test.awk input.txt
Update:
The above command leads to unaligned output in the shell:
TASK OWNER DEADLINE
Task 1 Emp 1 Monday
Task 2 Emp 2 Tuesday
Task 3 Emp 3 Wednesday
You can fix this using the column command:
awk -f test.awk input.txt | column -t -s $'\t'
Now the output looks clean:
TASK    OWNER  DEADLINE
Task 1  Emp 1  Monday
Task 2  Emp 2  Tuesday
Task 3  Emp 3  Wednesday
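If column is not installed, the padding can be approximated in awk itself. This is only a sketch: it reads tab-separated rows, remembers the widest cell per column, and pads every column except the last to that width plus two spaces. The input is fed in with printf here so the example is self-contained:

```shell
aligned=$(printf 'TASK\tOWNER\tDEADLINE\nTask 1\tEmp 1\tMonday\nTask 2\tEmp 2\tTuesday\nTask 3\tEmp 3\tWednesday\n' |
awk -F'\t' '
    {
        if (NF > cols) cols = NF
        for (c = 1; c <= NF; c++) {
            cell[NR, c] = $c                           # keep every cell for the END block
            if (length($c) > w[c]) w[c] = length($c)   # widest cell seen in column c
        }
    }
    END {
        for (r = 1; r <= NR; r++) {
            line = ""
            for (c = 1; c <= cols; c++) {
                # left-justify to the column width plus two spaces, except the last column
                pad = (c < cols) ? sprintf("%-" (w[c] + 2) "s", cell[r, c]) : cell[r, c]
                line = line pad
            }
            print line
        }
    }
')
printf '%s\n' "$aligned"
```

It holds the whole table in memory, so it is meant for small inputs like this one.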
Perl solution:
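perl -le 'while (<>) {
    chomp;
    ($h, $t) = split / : /;                        # label and value
    $i++, push @{$ar[0]}, $h if $h ne $ar[0][-1];  # new label: start a new column
    push @{$ar[$i]}, $t;                           # store the value in the current column
};
$" = "\t";                                         # separator used when interpolating arrays
print "@{$ar[0]}";                                 # header line
# print one row at a time until the first column is empty
print join $", map shift @{$ar[$_]}, 1 .. $#ar while @{$ar[1]}'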
If the tabs didn't align the text nicely, I'd use Text::Table.
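For comparison, the same idea of starting a new column whenever the label changes can be sketched in awk. It needs neither the label names nor the number of lines per label up front, but it does assume that all lines for one label are contiguous. The variable names and the temporary file are mine:

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
TASK : Task 1
TASK : Task 2
TASK : Task 3
OWNER : Emp 1
OWNER : Emp 2
OWNER : Emp 3
Deadline : Monday
Deadline : Tuesday
Deadline : Wednesday
EOF

by_label=$(awk -F' *: *' '
    $1 != prev {                                   # label changed: open a new column
        col++; row = 0
        head = (col == 1 ? "" : head "\t") $1
        prev = $1
    }
    {
        row++
        if (row > rows) rows = row                 # longest column decides the row count
        cell[row, col] = $2
    }
    END {
        print head
        for (r = 1; r <= rows; r++) {
            line = cell[r, 1]
            for (c = 2; c <= col; c++) line = line "\t" cell[r, c]
            print line
        }
    }
' "$input")
printf '%s\n' "$by_label"
```

If one label has fewer lines than the others, the missing cells simply come out empty.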