Expanding cells with more than 1 value to rows using bash

I have this file:

head test1.txt

id,name,position
123,James Marino,a
124,Charles Smith,a|b
125,Jennifer Pits,b|c|g
126,Daniel Earth,a|g

I'd like to change it into the following, using a UNIX command such as awk, sed or grep:

id,name,position
123,James Marino,a
124,Charles Smith,a
124,Charles Smith,b
125,Jennifer Pits,b
125,Jennifer Pits,c
125,Jennifer Pits,g
126,Daniel Earth,a
126,Daniel Earth,g

Does anyone know an efficient way of doing this?

awk to the rescue!

$ awk -F, -v OFS=, '{n=split($NF,a,"|");
                     for(i=1;i<=n;i++) {$NF=a[i]; print}}' file

id,name,position
123,James Marino,a
124,Charles Smith,a
124,Charles Smith,b
125,Jennifer Pits,b
125,Jennifer Pits,c
125,Jennifer Pits,g
126,Daniel Earth,a
126,Daniel Earth,g

This might work for you (GNU sed):

sed -r 's/((.*,)[^|]*)\|/\1\n\2/;P;D' file

This copies the line up to the first | and prepends it to the current line, followed by a newline. The current line has the value preceding the first | removed, along with the | itself. The first line is then printed and deleted, and the process repeats until every | has been accounted for.
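To make the intermediate state visible, here is a small sketch that runs the substitution once without the P;D loop, and then the full command, on a single sample row. It assumes GNU sed (for -r and for \n in the replacement); the variable names step and full are only for illustration.

```shell
# One substitution pass only (no P;D): the pattern space now holds two lines.
step=$(printf '%s\n' '125,Jennifer Pits,b|c|g' |
    sed -r 's/((.*,)[^|]*)\|/\1\n\2/')
printf '%s\n' "$step"
# 125,Jennifer Pits,b
# 125,Jennifer Pits,c|g

# Full command: P prints the first line, D deletes it and restarts the
# cycle on the remainder, so the substitution is applied again.
full=$(printf '%s\n' '125,Jennifer Pits,b|c|g' |
    sed -r 's/((.*,)[^|]*)\|/\1\n\2/;P;D')
printf '%s\n' "$full"
# 125,Jennifer Pits,b
# 125,Jennifer Pits,c
# 125,Jennifer Pits,g
```

Because D restarts the cycle without reading new input while a newline remains in the pattern space, each remaining | is handled by another pass of the same substitution.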

A pure Bash solution:

file=test1.txt

while IFS= read -r line || [[ -n $line ]] ; do
    IFS=, read -r num name values_str <<<"$line"
    IFS='|' read -r -a values <<<"$values_str"

    # Handle empty values field (otherwise the row will not be printed)
    [[ ${#values[@]} == 0 ]] && values=( '' )

    for val in "${values[@]}" ; do
        printf '%s,%s,%s\n' "$num" "$name" "$val"
    done
done <"$file"
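The empty-field case that the Bash loop guards against also affects the awk one-liner above: split returns 0 for an empty string, so the for loop never runs and the row is dropped. A minimal sketch of an awk variant that keeps such rows follows; the sample input, including the row with id 127, is made up for illustration.

```shell
# Sample input is hypothetical; row 127 has an empty position field.
input='id,name,position
124,Charles Smith,a|b
127,Example Person,'

result=$(printf '%s\n' "$input" |
    awk -F, -v OFS=, '{
        n = split($NF, a, "|")
        if (n == 0) { print; next }   # empty last field: keep the row unchanged
        for (i = 1; i <= n; i++) { $NF = a[i]; print }
    }')

printf '%s\n' "$result"
# id,name,position
# 124,Charles Smith,a
# 124,Charles Smith,b
# 127,Example Person,
```

Whether an empty position should produce a row at all depends on the data; this sketch simply mirrors the choice made in the Bash version.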

