
How to replace a delimiter inside a data field of a delimited file

Experts, I am trying to replace a pipe character '|' inside a data field of a pipe-delimited file.

Each record has 12 fields, and the last field contains a '|' as part of its data.

A record looks like this:

A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter

I want to replace the delimiter inside the last field with a blank space. How do I achieve that? I tried a few awk commands but didn't get the desired outcome.

Desired outcome:

A|B|C|D|E|F|G|H|I|J|K|TextWith Delimiter

Any suggestions?

This works:

echo 'A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter' |
 awk  'BEGIN{FS=OFS="|"}
       {$(NF-1)=$(NF-1) " " $(NF); NF=NF-1} 1'

Or sed:

echo 'A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter' |
 sed -E 's/\|([^|]*)$/ \1/'

Or gawk (the default awk on many Linux distributions; the three-argument form of match() used here is a gawk extension):

echo 'A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter' |
 gawk '{match($0, "(.*)\\|([^|]*$)", arr); print arr[1] " " arr[2]}'

Or Perl:

echo 'A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter' |
 perl -lpe 's/\|([^|]*$)/ $1/'

Each of these prints:

A|B|C|D|E|F|G|H|I|J|K|TextWith Delimiter

You have twice posted the comment "Doesn't work with the record I pasted in the above comment" (if there are * characters in the field, it goes on and tries to list all the files in the current directory).

That is likely a quoting and context issue with the shell.

Consider:

$ echo *
 file file.txt powerlog

Vs:

$ echo "*"
*

The first is expanded by the shell (since the string is not quoted), and the result of that expansion is the names of the files in the current directory. The second is the literal string *.
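The same thing happens when the record sits in a variable. Here is a minimal sketch, assuming a hypothetical variable named record that holds a shortened version of such a line; the quoted form is the one to feed to sed or awk:

```shell
# Hypothetical sample: part of a record whose data contains runs of asterisks.
record='K|**** SETT|LEMENT ****'

# Unquoted (deliberately wrong): the shell splits the value on spaces and
# treats each resulting word as a glob, so a word like **** can be replaced
# by the names of the files in the current directory.
echo $record

# Quoted: the value reaches sed unchanged, asterisks included.
fixed=$(printf '%s\n' "$record" | sed -E 's/\|([^|]*)$/ \1/')
printf '%s\n' "$fixed"
```

Reading the records straight from a file (sed '...' file) avoids the problem as well, because the shell never sees the data.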

A simple sed approach:

$ echo "A|B|C|D|E|F|G|H|I|J|K|TextWith|Delimiter" | sed 's/|/ /12'
A|B|C|D|E|F|G|H|I|J|K|TextWith Delimiter

The 12 tells sed to replace only the 12th match of the regular expression on each line.
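One caveat: "the 12th pipe" and "the last pipe" are the same thing only when the last field holds exactly one extra pipe. A small sketch with a made-up record (the values one, two and three are hypothetical) shows where the two sed commands part ways:

```shell
# Made-up record: the 12th field contains two extra pipes, not one.
rec='A|B|C|D|E|F|G|H|I|J|K|one|two|three'

# Replace only the 12th pipe on the line.
twelfth=$(printf '%s\n' "$rec" | sed 's/|/ /12')

# Replace only the last pipe on the line.
last=$(printf '%s\n' "$rec" | sed -E 's/\|([^|]*)$/ \1/')

printf '%s\n' "$twelfth"   # A|B|C|D|E|F|G|H|I|J|K|one two|three
printf '%s\n' "$last"      # A|B|C|D|E|F|G|H|I|J|K|one|two three
```

Neither command alone cleans up a field with several embedded pipes, so it is worth checking how many extra pipes the real data can contain.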

Here's another invocation with the input containing asterisks:

$ cat line
A|5|A|1|u|5|L|2|O|H|V|**** SETT|LEMENT DOCUMENTATION **** FinalOffer **** REASON : had been oot work previously **** SOURCE OF FUNDS : work **** DISCLOSURE READ : YES **** DELINQUENCY STAGE: RECOVERY **** ACCOUNT BALANCE : $2.46 **** SIF AMOUNT : $12**** PERCENTAGE : 19 % **** NUMBER OF DAYS : 128 **** PAYMENT 1: $50 DATE1: 7/21/2020

$ sed 's/|/ /12' line
A|5|A|1|u|5|L|2|O|H|V|**** SETT LEMENT DOCUMENTATION **** FinalOffer **** REASON : had been oot work previously **** SOURCE OF FUNDS : work **** DISCLOSURE READ : YES **** DELINQUENCY STAGE: RECOVERY **** ACCOUNT BALANCE : $2.46 **** SIF AMOUNT : $12**** PERCENTAGE : 19 % **** NUMBER OF DAYS : 128 **** PAYMENT 1: $50 DATE1: 7/21/2020

Here's an awk approach (it assumes the last field contains exactly one extra pipe, so each record splits into 13 pieces):

awk -F\| -v OFS=\| '{ print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12 " " $13 }'
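If the last field might contain more than one pipe, the same idea can be written as a loop. This is only a sketch: join_tail is a hypothetical helper name, and its argument is the number of real fields (12 in the question). It keeps the first 12 fields pipe-separated and joins everything after them with single spaces:

```shell
# join_tail N: read records on stdin, keep fields 1..N separated by '|',
# and append every field after the Nth with a space instead of a pipe.
join_tail() {
  awk -F'|' -v OFS='|' -v n="$1" '{
    out = $1
    for (i = 2; i <= NF; i++)
      out = out (i <= n ? OFS : " ") $i
    print out
  }'
}

# Sample record with two extra pipes in the last field.
printf '%s\n' 'A|B|C|D|E|F|G|H|I|J|K|Text|With|More' | join_tail 12
# prints: A|B|C|D|E|F|G|H|I|J|K|Text With More
```

For a file, the call would be join_tail 12 < line. Records with 12 or fewer pieces pass through unchanged.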


 