
Script to add and remove columns depending on the variable length of the line in a file

I have n number of records in a file karan.csv in the following format:

A=9607738162|B=9607562681|C=20200513191434|D=|F=959852599|G=MT|H=4012|I=4012|J=9607562681|K=947100410|
A=960299773008|B=9607793008|C=20200513191327|D=|E=ST|F=959852599|G=MO|H=2001|I=2001|J=9607793008|K=947100180|
A=9607704530|B=9607839496|C=20200513191730|D=|F=959852599|G=MT|I=5012|J=9607839496|K=|

Note that these three lines have 10, 11 and 9 columns respectively. Lines with these counts are mixed at random throughout the file, but the count is always one of these three.

Now, I want to create a script that removes $5 (including its delimiter) from any line with 11 columns, so that it looks exactly like a row with 10 columns:

A=9607738162|B=9607562681|C=20200513191434|D=|F=959852599|G=MT|H=4012|I=4012|J=9607562681|K=947100410|

and that adds "H=|" as $7 on any line where the column count is 9:

A=9607704530|B=9607839496|C=20200513191730|D=|F=959852599|G=MT|H=|I=5012|J=9607839496|K=|

Now I wrote the following code to achieve it:

for text in $(cat /tmp/karan.csv);do
  count=`awk -F"|" '{print NF-1}' $text`
  if [ $count == 9 ]
  then
  awk  'BEGIN{FS=OFS="|"}{$7="|H"}1' $text >> /tmp/karantest2.csv
  elif [ $count == 10 ]
  then
  echo $text >> /tmp/karantest2.csv
  else
  awk -F"|" '{print $1,$2,$3,$4,$6,$7,$8,$9,$10,$11}' $text >> /tmp/karantest2.csv
  fi
  done

But after debugging, I realised the script was not moving ahead after:

count=`awk -F"|" '{print NF-1}' $text`

Can anyone please help me with this?

Regards
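A likely reason the loop stalls at that line: awk treats a command-line operand of the form name=value as a variable assignment, not as a file name. Every record here starts with A=..., so awk ends up with no input file and waits on standard input. Below is a minimal sketch that keeps the shape of the original loop but pipes each line into awk instead. The function names normalize_line and normalize_file are made up for this example, and it assumes, as in the question, that only delimiter counts of 9, 10 and 11 occur.

```shell
#!/bin/sh
# normalize_line: hypothetical helper, not part of the original script.
# Takes one record as $1 and prints it with exactly 10 columns.
normalize_line() {
  line=$1
  # Count delimiters by piping the text into awk rather than passing it
  # as an operand (an operand such as "A=..." is taken as an assignment).
  count=$(printf '%s\n' "$line" | awk -F'|' '{ print NF - 1 }')
  if [ "$count" -eq 9 ]; then
    # 9 columns: insert an empty H column after the 6th field.
    printf '%s\n' "$line" |
      awk 'BEGIN { FS = OFS = "|" } { $6 = $6 OFS "H=" } 1'
  elif [ "$count" -eq 11 ]; then
    # 11 columns: drop the 5th field; $12 is the empty field after the
    # trailing delimiter, so printing it keeps the final "|".
    printf '%s\n' "$line" |
      awk 'BEGIN { FS = OFS = "|" } { print $1, $2, $3, $4, $6, $7, $8, $9, $10, $11, $12 }'
  else
    # Assumed to be the 10-column case: pass through unchanged.
    printf '%s\n' "$line"
  fi
}

# normalize_file: hypothetical wrapper that reads a file line by line.
# "while read" avoids the word splitting done by "for text in $(cat file)".
normalize_file() {
  while IFS= read -r text; do
    normalize_line "$text"
  done < "$1"
}
```

It could be called as normalize_file /tmp/karan.csv > /tmp/karantest2.csv. Starting two awk processes per line is slow on large files, so a single awk or sed pass over the whole file, as in the answers below, is usually the better choice.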

A sed solution, which first inserts H=| on lines with 9 columns, then removes the 5th column on lines with 11 columns:

sed -E '/^([^\|]+\|){9}$/s/(([^\|]+\|){6})/\1H=\|/;/^([^\|]+\|){11}$/s/(([^\|]+\|){4})[^\|]+\|/\1/' inputfile

If you need a POSIX-compliant command, then

  • since -E is not POSIX, you have to escape every (, ), {, }, + (and other special characters, which are not in this command), and un-escape \| to make it literal;
  • since \+ is not POSIX either, you need to use the more verbose \{1,\}.

Here's the POSIX-compliant command:

sed '/^\([^|]\{1,\}|\)\{9\}$/s/\(\([^|]\{1,\}|\)\{6\}\)/\1H=|/;/^\([^|]\{1,\}|\)\{11\}$/s/\(\([^|]\{1,\}|\)\{4\}\)[^|]\{1,\}|/\1/' inputfile
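As a quick check, the POSIX form can be run against the three sample records from the question through a here-document. The wrapper name fix_columns is just something picked for this sketch; the sed script inside it is the command above, unchanged.

```shell
#!/bin/sh
# fix_columns: hypothetical wrapper around the POSIX sed command above.
# Reads records on standard input and writes the normalized records.
fix_columns() {
  sed '/^\([^|]\{1,\}|\)\{9\}$/s/\(\([^|]\{1,\}|\)\{6\}\)/\1H=|/;/^\([^|]\{1,\}|\)\{11\}$/s/\(\([^|]\{1,\}|\)\{4\}\)[^|]\{1,\}|/\1/'
}

# Feed the sample records from the question to the function.
fix_columns <<'EOF'
A=9607738162|B=9607562681|C=20200513191434|D=|F=959852599|G=MT|H=4012|I=4012|J=9607562681|K=947100410|
A=960299773008|B=9607793008|C=20200513191327|D=|E=ST|F=959852599|G=MO|H=2001|I=2001|J=9607793008|K=947100180|
A=9607704530|B=9607839496|C=20200513191730|D=|F=959852599|G=MT|I=5012|J=9607839496|K=|
EOF
```

Note that [^|]\{1,\} needs at least one character before each delimiter. That holds here because even an empty value still carries its key, such as D= or K=.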

A pure awk solution:

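awk -F'|' '

BEGIN { OFS="|" }

# Each line ends with "|", so NF is one more than the number of columns.
NF==10 { print $1, $2, $3, $4, $5, $6, "H=", $7, $8, $9, $10 }   # 9 columns: add empty H
NF==11 { print $0 }                                              # 10 columns: keep as is
NF==12 { print $1, $2, $3, $4, $6, $7, $8, $9, $10, $11, $12 }   # 11 columns: drop $5

' karan.csv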

Output for the sample input provided is:

A=9607738162|B=9607562681|C=20200513191434|D=|F=959852599|G=MT|H=4012|I=4012|J=9607562681|K=947100410|
A=960299773008|B=9607793008|C=20200513191327|D=|F=959852599|G=MO|H=2001|I=2001|J=9607793008|K=947100180|
A=9607704530|B=9607839496|C=20200513191730|D=|F=959852599|G=MT|H=|I=5012|J=9607839496|K=|
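Both solutions above key off the column count. If the missing or extra field might differ from line to line, a variant that looks at the key names instead could be more robust. The sketch below assumes that the wanted layout is always A, B, C, D, F, G, H, I, J, K and that every field has the form KEY=value. Both assumptions are read off the sample data, not stated in the question, and the function name by_key is made up for this example.

```shell
#!/bin/sh
# by_key: hypothetical helper that rebuilds each record from a fixed key list.
# Assumption: the wanted columns are A B C D F G H I J K, in that order.
by_key() {
  awk -F'|' '
  BEGIN { n = split("A B C D F G H I J K", want, " ") }
  {
    split("", val)                      # clear values left from the previous record
    for (i = 1; i <= NF; i++) {
      eq = index($i, "=")               # position of the first "=" in the field
      if (eq > 0) val[substr($i, 1, eq - 1)] = substr($i, eq + 1)
    }
    out = ""
    # Emit the wanted keys in order; a key that is absent gets an empty value,
    # and a key that is not wanted (such as E) is simply never emitted.
    for (k = 1; k <= n; k++) out = out want[k] "=" val[want[k]] "|"
    print out
  }'
}
```

It reads standard input, for example by_key < karan.csv. With this approach any unexpected extra key is dropped silently, which may or may not be what is wanted.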
