
Bash - Build a CSV File from TXT

I'm new to bash and grep. I am trying to produce a CSV file from a TXT file that contains these lines:

Input:

1. Fisrt - Name: Joanna Last - Name: Yang
Place of birth: Paris Date of birth: 01/01/1972 Sex: F
Number: 0009876541234567
2. Fisrt - Name: Bob Last - Name: Lee
Place of birth: London Date of birth: 05/08/1969 Sex: M
Number: 0005671890765223

Output:

"Joanna","Yang","Paris","01/01/1972","F","0009876541234567"
"Bob","Lee","London","05/08/1969","M","0005671890765223"

Any suggestions would be appreciated!

Using only one regex with grep won't be easy.
You can try multiple regexes and concatenate the results.

For instance:
To get the first names you can use this regex: "Fisrt - Name: ([a-zA-Z]+)".
Save this into a variable.

Next, to get the birth dates you can use "birth: ([0-9]+/[0-9]+/[0-9]+)".
Save this into a variable.

Do it for each part and concatenate the results with a comma.

It's clearly not the best way, but it's a start. To help with regexes you can use https://regex101.com/.
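
A minimal sketch of that idea, assuming the input has exactly the layout shown in the question. grep on its own prints whole lines or whole matches rather than a single capture group, so this sketch swaps in sed -n 's/.../\1/p' for the extraction step. The helper names field and to_csv are made up for illustration, and the character classes are assumptions about what each field may contain.

```shell
#!/bin/sh
# Sketch: one extraction per field, then paste the columns side by side.

# field REGEX FILE: for every line matching REGEX (an ERE with one capture
# group), print the captured text wrapped in double quotes.
field() {
    sed -n -E "s#.*$1.*#\"\\1\"#p" "$2"
}

# to_csv FILE: write one CSV row per record to stdout.
to_csv() {
    tmp=$(mktemp -d) || return 1
    field 'Fisrt - Name: ([A-Za-z]+)'             "$1" > "$tmp/1"
    field 'Last - Name: ([A-Za-z]+)'              "$1" > "$tmp/2"
    field 'Place of birth: ([A-Za-z]+)'           "$1" > "$tmp/3"
    field 'Date of birth: ([0-9]+/[0-9]+/[0-9]+)' "$1" > "$tmp/4"
    field 'Sex: ([A-Za-z])'                       "$1" > "$tmp/5"
    field 'Number: ([0-9]+)'                      "$1" > "$tmp/6"
    # paste joins line N of every column file with commas.
    paste -d, "$tmp/1" "$tmp/2" "$tmp/3" "$tmp/4" "$tmp/5" "$tmp/6"
    rm -rf "$tmp"
}

# Usage: to_csv input.txt > output.csv
```

This only lines up correctly if every record contains all six fields; a missing field would shift the later rows of that column.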

Maybe try using the sed command-line tool.
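
One way this could look, as a rough sketch: sed's N command appends the next input line to the pattern space, so two N commands gather a whole three-line record before a single substitution rewrites it. This assumes the file consists of complete three-line records in the layout from the question; the function name records_to_csv is made up for illustration.

```shell
#!/bin/sh
# records_to_csv FILE: join each three-line record and rewrite it as one
# quoted CSV row. \n in the pattern matches the newlines that N inserted.
records_to_csv() {
    sed -E -e 'N;N' \
        -e 's/^.*Fisrt - Name: (.*) Last - Name: (.*)\nPlace of birth: (.*) Date of birth: (.*) Sex: (.*)\nNumber: (.*)$/"\1","\2","\3","\4","\5","\6"/' \
        "$1"
}

# Usage: records_to_csv input.txt > output.csv
```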

If your file is nicely formatted, no regexes are needed.
We can read three lines at a time and split them on spaces, since we are only interested in specific fields. If you can assume that no field in the file contains spaces (I think no valid human name has spaces in it... right?), you can just do this:

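# Each record spans three lines; read them together and keep only the wanted words.
while
    # 1. Fisrt - Name: <name> Last - Name: <last>
    IFS=' ' read -r _ _ _ _ name _ _ _ last &&
    # Place of birth: <birthplace> Date of birth: <birthdate> Sex: <sex>
    IFS=' ' read -r _ _ _ birthplace _ _ _ birthdate _ sex &&
    # Number: <number>
    IFS=' ' read -r _ number
do
    printf '"%s","%s","%s","%s","%s","%s"\n' \
        "$name" "$last" "$birthplace" "$birthdate" "$sex" "$number"
done <input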

Live version available at onlinedbg.
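
If the no-spaces assumption does not hold (say, a two-word first name or a birthplace like "New York"), a variant of the same three-lines-at-a-time loop can cut on the field labels instead of on spaces. This is only a sketch: it assumes the labels appear exactly as in the question, and the function name parse_records is made up for illustration.

```shell
#!/bin/sh
# parse_records: read three-line records from stdin and print quoted CSV rows.
# ${var#pattern} strips a matching prefix, ${var%pattern} a matching suffix.
parse_records() {
    while IFS= read -r l1 && IFS= read -r l2 && IFS= read -r l3; do
        name=${l1#*"Fisrt - Name: "};       name=${name%" Last - Name: "*}
        last=${l1#*"Last - Name: "}
        place=${l2#"Place of birth: "};     place=${place%" Date of birth: "*}
        birthdate=${l2#*"Date of birth: "}; birthdate=${birthdate%" Sex: "*}
        sex=${l2#*"Sex: "}
        number=${l3#"Number: "}
        printf '"%s","%s","%s","%s","%s","%s"\n' \
            "$name" "$last" "$place" "$birthdate" "$sex" "$number"
    done
}

# Usage: parse_records < input > output.csv
```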

In one line:

~ $ cat yourfile.txt 
1. Fisrt - Name: Joanna Last - Name: Yang
Place of birth: Paris Date of birth: 01/01/1972 Sex: F
Number: 0009876541234567
2. Fisrt - Name: Bob Last - Name: Lee
Place of birth: London Date of birth: 05/08/1969 Sex: M
Number: 0005671890765223
~ $ sed -r "s/^.*Fisrt - Name: (.*) Last - Name: (.*)$/\1,\2;/g" yourfile.txt | sed -r "s/^Place of birth: (.*) Date of birth: (.*) Sex: (.*)$/\1,\2,\3;/g" | sed -r "s/^Number: (.*)$/\1/g" | sed -n 'H;${x;s/;\n/,/g;s/^,//;p;}' | tail -n +2 > yourfile.csv
~ $ cat yourfile.csv 
Joanna,Yang,Paris,01/01/1972,F,0009876541234567
Bob,Lee,London,05/08/1969,M,0005671890765223
~ $ 
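
The pipeline above leaves the fields unquoted, while the question asked for each field in double quotes. As a small sketch, one more sed step could add them, assuming no field is empty and none contains a comma or a double quote. The function name quote_fields is made up for illustration.

```shell
#!/bin/sh
# quote_fields: wrap every non-empty comma-separated field read from stdin
# in double quotes. & in the replacement stands for the matched field.
quote_fields() {
    sed 's/[^,][^,]*/"&"/g'
}

# Usage: quote_fields < yourfile.csv > quoted.csv
```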

Hope it helps.


 