I have a text file that I want to convert into a CSV file, and then load that CSV file into a PostgreSQL table.
My input text file (old_sample.txt) is:
SVCOP,"12980","2019"0627","1DEX","LUBE, OIL & FILTER - DEXOS "1"","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00","","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
I tried the code below:
cat old_sample.txt
printf "\n"
echo "____________________________________"
printf "\n"
cat old_sample.txt | sed ': again
s/\("[^",]*\)"\([^",]*"\)/\1\2/g
t again
s/""/"/g'
Output is
SVCOP,"12980","2019"0627","1DEX","LUBE, OIL & FILTER - DEXOS "1"","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00","","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
SVCOP,"12980","20190627","1DEX","LUBE, OIL & FILTER - DEXOS "1","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00",","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
The problem is the field "LUBE, OIL & FILTER - DEXOS "1""
The double quotes around "1" are not removed because a comma is present inside the enclosing double quotes, whereas "2019"0627" is handled correctly. I want to remove all double quotes inside a string that is enclosed in opening and closing double quotes; otherwise the import fails with a database error.
This is my code
nl -ba -nln -s, < old_sample.txt | sed ': again
s/\("[^",]*\)"\([^",]*"\)/\1\2/g
t again' | grep 'SVCPTS' > old_sample.csv
psql_local <<SQL || die "Failed to import parts data"
\copy sample_table from 'old_sample.csv' with (format csv, header false)
SQL
My target output is
SVCOP,"12980","20190627","1DEX","LUBE, OIL & FILTER - DEXOS 1","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00","","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
Personally, if I were doing this, I would reach for a utility program. I think you may be able to achieve it by finding the right RegEx - but it might end up being quite complex.
Using something like csvkit - specifically, the csvformat command seems a lot easier. It would also be more reliable if you need to re-use this script with other data in the future (which could have newlines in some fields, or other situations you may need to account for).
Would you please try the following:
while IFS= read -r str; do                  # read one line into the variable "str"
    while true; do                          # infinite loop
        str2=$(sed 's/\([^,]\)"\([^,]\)/\1\2/g' <<< "$str")
        [[ "$str2" = "$str" ]] && break     # if there is no change, exit the loop
        str="$str2"                         # update "str" for next iteration
    done
    echo "$str"
done < "old_sample.txt"
Output:
SVCOP,"12980","20190627","1DEX","LUBE, OIL & FILTER - DEXOS 1","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00","","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
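The regex \([^,]\)"\([^,]\) matches a double quote that is surrounded by non-comma characters. The quotes that open or close a field sit next to a comma (or at the start or end of the line), so they are left alone.
[EDIT] If your file has CR+LF line endings, please try instead: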
while IFS= read -r str; do                  # read one line into the variable "str"
    while true; do                          # infinite loop
        str2=$(sed 's/\([^,]\)"\([^,]\)/\1\2/g' <<< "$str")
        [[ "$str2" = "$str" ]] && break     # if there is no change, exit the loop
        str="$str2"                         # update "str" for next iteration
    done
    # echo "$str"                           # add LF at the end of the output line
    printf '%s\r\n' "$str"                  # add CR+LF at the end of the output line
done < <(tr -d "\r" < "VehSer_NEWM11_test.txt")     # remove CR code
BTW, if Perl is an option for you, the following code will run much faster:
perl -pe '1 while s/([^,])"([^,\r])/$1$2/g' VehSer_NEWM11_test.txt
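The same substitution can also be repeated inside sed itself, using the ": again ... t again" loop from the question, which avoids starting a new sed process for every pass. The following is only a sketch: the function name strip_inner_quotes and the shortened sample line are made up for illustration, and it assumes LF line endings and that no field content contains a double quote directly next to a comma.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove every double quote that has a non-comma
# character on both sides, repeating until no such quote is left.
strip_inner_quotes() {
    sed -e ':again' \
        -e 's/\([^,]\)"\([^,]\)/\1\2/g' \
        -e 't again'
}

# Shortened sample line (an assumption, modelled on old_sample.txt)
sample='SVCOP,"2019"0627","LUBE, OIL & FILTER - DEXOS "1"","","N"'
printf '%s\n' "$sample" | strip_inner_quotes
# prints: SVCOP,"20190627","LUBE, OIL & FILTER - DEXOS 1","","N"
```

For a whole file it could be called as: strip_inner_quotes < old_sample.txt > old_sample.csv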
I couldn't do it with a single substitution, so I came up with this:
$ sed "s/['\"]//g; s/,/\",\"/g; s/\",\" /, /g; s/,,/,\"\",/g; s/$/\"/; s/\"//" file
SVCOP,"12980","20190627","1DEX","LUBE, OIL & FILTER - DEXOS 1","I,0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS 1","91","LANE","LANE","L,LA MERE","125.00","125.00,"",0.00","0.00","0,0","0,||||||||||||||||||||||||","N"
Or this, if you need to keep ''1'':
$ sed 's/"//g; s/,/","/g; s/"," /, /g; s/,,/,"",/g; s/$/"/; s/"//' file
SVCOP,"12980","20190627","1DEX","LUBE, OIL & FILTER - DEXOS 1","I","0.4","0.4","15.95","10.80","0.00","0.00","0.00","0.00","0.00","0.00","38.03","30.17","53.98","40.97","FULL SYNTHETIC MOTOR OIL.","LUBE, OIL & FILTER - DEXOS ''1''","91","LANE","LANE","L","LA MERE","125.00","125.00","","0.00","0.00","0","0","0","||||||||||||||||||||||||","N"
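Whichever command is used, it may be worth checking the result for leftover inner quotes before the \copy step. A minimal sketch, assuming LF line endings (with CR+LF, the closing quote before the CR would be reported as well); the function name has_inner_quotes is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit status 0 if stdin contains a double quote with a
# non-comma character on both sides, i.e. a stray quote inside a field.
has_inner_quotes() {
    grep -q '[^,]"[^,]'
}

# Example: the first field below still contains a stray quote
if printf '%s\n' '"2019"0627","1DEX"' | has_inner_quotes; then
    echo "stray quotes remain"
fi
```

Run against the generated file, e.g. has_inner_quotes < old_sample.csv, a zero exit status would suggest the file is not yet safe to import.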