Separate string of digits into 3 columns using awk/sed

I have a string of digits in rows as below:

6390212345678912011012112121003574820069121409100000065471234567810
6390219876543212011012112221203526930428968109100000065478765432196

I need to split each row into 6 columns, as below:

639021234567891,201101211212100,3574820069121409,1000000,654712345678,10
639021987654321,201101211222120,3526930428968109,1000000,654787654321,96

Conditions:

  • Field 1 = 15 Char
  • Field 2 = 15 Char
  • Field 3 = 15 or 16 Char
  • Field 4 = 7 Char
  • Field 5 = 12 Char
  • Field 6 = 2 Char

Final output (fields 1, 3 and 5 only):

639021234567891,3574820069121409,654712345678
639021987654321,3526930428968109,654787654321

It's not clear how to detect whether field 3 should have 15 or 16 chars. But as a draft for the first 3 fields you could use something like this:

echo 63902910069758520110121121210035748200670169758510 |
  awk '{ printf("%s,%s,%s\n", substr($1,1,15), substr($1,16,15), substr($1,31,15)); }'

Or with sed:

echo "$NUM" | sed -r 's/^([0-9]{15})([0-9]{15})([0-9]{15,16}) ...$/\1,\2,\3, .../'

This will use 15 or 16 for the length of field 3 based on the length of the whole string, because the pattern is anchored at both ends and every other field has a fixed width.
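For reference, a complete version of that pattern might look like the sketch below. It assumes every line holds exactly the six fields listed in the question and nothing else. The function name `split_fields_sed` is made up for illustration, and `-E` is used in place of `-r` because both GNU and BSD sed accept it.

```shell
# Hypothetical helper: reads digit strings on stdin and prints fields 1, 3 and 5.
# The ^ and $ anchors plus the fixed widths of the other five fields force
# field 3 to match 15 or 16 digits depending on the total line length.
split_fields_sed() {
  sed -E 's/^([0-9]{15})([0-9]{15})([0-9]{15,16})([0-9]{7})([0-9]{12})([0-9]{2})$/\1,\3,\5/'
}
```

Usage would be something like `split_fields_sed < input.txt`. Lines that do not match the pattern are printed unchanged, so malformed input is easy to spot in the output.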

If you're using gawk:

gawk -v f3w=16 'BEGIN {OFS=","; FIELDWIDTHS="15 15 " f3w " 7 12 2"} {print $1, $3, $5}'

Do you know ahead of time what the width of Field 3 should be? Do you need it to be programmatically determined? How? Based on the total length of the line? Does it change line-by-line?

Edit:

If you don't have gawk, then this is a similar approach:

awk -v f3w=16 'BEGIN {OFS=","; FIELDWIDTHS="15 15 " f3w " 7 12 2"; n=split(FIELDWIDTHS,fw," ")} { p=1; r=$0; for (i=1;i<=n;i++) { $i=substr(r,p,fw[i]); p += fw[i]}; print $1,$3,$5}'
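If the width of field 3 does change line by line, one option is to derive it from the line length instead of passing it in with `-v`. The sketch below assumes the other five fields always have the widths given in the question (15 + 15 + 7 + 12 + 2 = 51), so field 3 is whatever remains. The function name `split_fields_awk` is hypothetical, and lines whose remainder is not 15 or 16 are silently skipped, which may or may not be what you want.

```shell
# Hypothetical helper: reads digit strings on stdin and prints fields 1, 3 and 5.
split_fields_awk() {
  awk '{
    f3w = length($0) - 51                 # 51 = 15 + 15 + 7 + 12 + 2
    if (f3w != 15 && f3w != 16) next      # skip lines of unexpected length
    f1 = substr($0, 1, 15)                # field 1: columns 1-15
    f3 = substr($0, 31, f3w)              # field 3: starts after fields 1 and 2
    f5 = substr($0, 31 + f3w + 7, 12)     # field 5: starts after fields 3 and 4
    print f1 "," f3 "," f5
  }'
}
```

This uses only `length` and `substr`, so it should behave the same in gawk, mawk and other POSIX awks.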
