My file looks like this:
A0010 A R G 222
ALBXXXXXLE DRIVE - NO N1 Y 2 C 1 0
A R G BOBBY BEARD 1 NC N N 0 0.00
AERXXXX 0.00
NC 22211
A0013
A & A SERVICE CENTER P O BOX 113 - NO N1 Y 2 C 1 0
A & A SERVICE CENTER 1 NC N Y 0 0.00
HARRELLSVILLE 0.00
NC 27942
A0016 A HOME GARDEN SHOP 111 E MAIN STREET 111-111-1110 NO N1 Y 2 U 1 0
HOME GARDEN SHOP PAM 1 NC N Y 0 0.00
AERBDER 0.00
NC 24520
A0039 XXXXXXX HILL APTS. P.O. BOX 604 222-7111 NO N1 Y 2 U 1 0
XXXXXXX HILL APTS. TXXXMAN MORRIS 1 NC Y Y 0 0.00
AERBDER 0.00
NC 27510
I want to split the file into records, using the IDs in the first column (A0010, A0013, A0016, A0039) as separators, and load them into a database. I tried gawk, but it treats only the first match as a record separator:
cat temp1 | gawk 'BEGIN {RS="^[A-Z][0-9][0-9][0-9][0-9]";} {print NR,"and RT=" RT}' | sed -e 's/ \+/ /g'
Output:
1 and RT=A0010
2 and RT=
It does not pick up the second match. Please help.
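Replace your gawk command with the following:
cat temp1 | gawk 'BEGIN {RS="[A-Z][0-9][0-9][0-9][0-9]";} {print NR,"and RT=" RT}'
The ^ is causing your problem. In awk, ^ matches the beginning of a string, not the beginning of each line, so inside RS it can only match at the very start of the file. That is why only A0010 was found. Note that RT is a gawk extension, so the command has to be run with gawk.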
Edit (based on the comments):
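If the pattern occurs both at the beginning and in the middle of lines, filter for the lines that start with it first. Note that grep keeps only those lines, so the continuation lines of each record are dropped: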
grep -E "^[A-Z][0-9]{3}" temp1 | gawk 'BEGIN {RS="[A-Z][0-9][0-9][0-9][0-9]";} {print NR,"and RT=" RT}'
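If the continuation lines need to be kept, one alternative is to leave RS alone and detect the start of a record line by line, since ^ in an ordinary awk pattern is tested against each input line. The following is only a sketch, not part of the original answer: the function name split_records is hypothetical, and it assumes every record starts with a line beginning with one capital letter followed by four digits.

```shell
# Hypothetical helper: tag every input line with the number of the record it belongs to.
split_records() {
  awk '
    # A line starting with a letter and four digits (e.g. A0010) opens a new record.
    /^[A-Z][0-9][0-9][0-9][0-9]/ { n++ }
    # Print each line prefixed with its record number.
    { print n ": " $0 }
  '
}

# Small made-up sample; B1234 in the middle of a line does not start a new record.
split_records <<'EOF'
A0010 A R G 222
NC 22211
A0013
NC 27942 B1234
EOF
```

For the real file this would be called as split_records < temp1. Because the match is anchored per line, an ID that appears in the middle of a line is left alone, and the record number can then be used to group lines before loading them into the database.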