Expanding cells with more than 1 value to rows using bash

I have this file:

head test1.txt

id,name,position
123,James Marino,a
124,Charles Smith,a|b
125,Jennifer Pits,b|c|g
126,Daniel Earth,a|g

I'd like to change it to the following, using a UNIX command such as awk, sed or grep:

id,name,position
123,James Marino,a
124,Charles Smith,a
124,Charles Smith,b
125,Jennifer Pits,b
125,Jennifer Pits,c
125,Jennifer Pits,g
126,Daniel Earth,a
126,Daniel Earth,g

Does anyone know an efficient way of doing this?

awk to the rescue!

$ awk -F, -v OFS=, '{n=split($NF,a,"|");
                     for(i=1;i<=n;i++) {$NF=a[i]; print}}' file

id,name,position
123,James Marino,a
124,Charles Smith,a
124,Charles Smith,b
125,Jennifer Pits,b
125,Jennifer Pits,c
125,Jennifer Pits,g
126,Daniel Earth,a
126,Daniel Earth,g

This might work for you (GNU sed):

sed -r 's/((.*,)[^|]*)\|/\1\n\2/;P;D' file

This copies the line up to the first | and prepends the copy to the current line, followed by a newline. The value preceding the first | is then removed from the current line, along with the | itself. The first line is printed and deleted, and the process repeats until all |'s have been accounted for.
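To make the steps concrete, here is a small sketch that runs the substitution once on a single sample row, then runs the full command on the same row. It assumes GNU sed (-E is equivalent to -r, and \n in the replacement is a newline); the variable names step and expanded are only for illustration.

```shell
row='125,Jennifer Pits,b|c|g'

# One substitution only: the text up to the first | becomes its own line,
# and the "id,name," prefix is repeated in front of the remainder.
step=$(printf '%s\n' "$row" | sed -E 's/((.*,)[^|]*)\|/\1\n\2/')
printf '%s\n' "$step"

# Full command: P prints the first line of the pattern space, D deletes it
# and restarts the cycle on what is left, so the substitution runs again.
expanded=$(printf '%s\n' "$row" | sed -E 's/((.*,)[^|]*)\|/\1\n\2/;P;D')
printf '%s\n' "$expanded"
```

The first command leaves "125,Jennifer Pits,b" on one line and "125,Jennifer Pits,c|g" on the next; the second yields one row each for b, c and g.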

A pure Bash solution:

file=test1.txt

while IFS= read -r line || [[ -n $line ]] ; do
    IFS=, read -r num name values_str <<<"$line"
    IFS='|' read -r -a values <<<"$values_str"

    # Handle empty values field (otherwise the row will not be printed)
    [[ ${#values[@]} == 0 ]] && values=( '' )

    for val in "${values[@]}" ; do
        printf '%s,%s,%s\n' "$num" "$name" "$val"
    done
done <"$file"
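The empty-field guard in the Bash loop matters for the awk one-liner too: when the last field is empty, split() returns 0, so its loop prints nothing for that row. Below is a minimal sketch of an awk variant that keeps such rows unchanged, assuming that is the desired behaviour. The function name expand_positions and the sample row for id 127 are hypothetical, not from the original question.

```shell
# expand_positions is a hypothetical helper name used only for this sketch.
# It reads the files given as arguments, or standard input if there are none.
expand_positions() {
    awk -F, -v OFS=, '{
        n = split($NF, a, "|")          # n is 0 when the last field is empty
        if (n == 0) { print; next }     # keep the row as it is
        for (i = 1; i <= n; i++) { $NF = a[i]; print }
    }' "$@"
}

# Sample input with an invented row whose position field is empty.
result=$(printf '%s\n' \
    'id,name,position' \
    '124,Charles Smith,a|b' \
    '127,Sample Person,' | expand_positions)
printf '%s\n' "$result"
```

On real data it could be called as expand_positions test1.txt.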
