Parsing quote and escape characters in a CSV file

I need to import large sets of data into SQL. The output file (text) is UTF-8, generated from an ABAP program where I can define the quote and escape characters. By default I'm using:

\\ as the escape string

" (double quote) as the quote character

; (semicolon) to separate the columns

My problem is that most of the text columns contain double quotes or escape characters, and when I try to import the file into the SQL database the interface fails because the data ends up in the wrong columns.

I managed to get rid of the \\n with the Python script below, but I'm struggling with the double quotes. Can you suggest a way to replace the double quotes that appear inside the quoted fields?

Text fields such as banana from "Ecuador" are causing a big mess, since the data in the CSV file is stored as "banana from "Ecuador""

import csv

filename = "0180914_074626.csv"
with open(filename, 'r', encoding='utf8', errors='ignore') as inputfile, \
     open(filename + '.log.csv', 'w', encoding="utf8") as outputfile_log:
    w = csv.writer(outputfile_log, delimiter=';', quotechar='"', lineterminator='\n')
    # The export is semicolon-separated, so the reader needs the same delimiter
    for record in csv.reader(inputfile, delimiter=';', quotechar='"'):
        # Replace line breaks embedded in a field with '-'
        w.writerow(tuple(s.replace("\n", '-') for s in record))

Look into using BCP (the SQL Server bulk copy utility) with a format file.

Then you can specify that, for example, the last column is terminated by a double quote followed by a CRLF, while the other columns are terminated by a double quote followed by a semicolon.

For each column, any characters that do not match the full sequence of characters making up that column's terminator are not treated as a terminator, so a lone double quote inside the text does not end the field.
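
If BCP is not an option, the same terminator idea can be sketched in Python before the import. This is only a minimal sketch under some assumptions: every field in the export is quoted, no field contains the sequence ";" (quote, semicolon, quote), and no field contains a quote immediately followed by a line break and another quote. The function names split_records and to_clean_csv are made up for this illustration. It splits on the multi-character terminators instead of relying on CSV quoting rules, then rewrites the rows with inner quotes doubled so a standard CSV parser can read them.

```python
import csv
import io

FIELD_TERMINATOR = '";"'    # closing quote, semicolon, opening quote
RECORD_TERMINATOR = '"\n"'  # closing quote, line break, opening quote


def split_records(text):
    """Split raw export text into rows using terminators, not CSV quoting."""
    text = text.replace("\r\n", "\n").strip("\n")
    if not text:
        return []
    # Drop the opening quote of the first field and the closing quote of the last.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        text = text[1:-1]
    return [raw.split(FIELD_TERMINATOR) for raw in text.split(RECORD_TERMINATOR)]


def to_clean_csv(rows):
    """Write rows as semicolon-separated CSV with inner quotes doubled."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=";", quotechar='"',
                        quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        # Same idea as the script in the question: flatten embedded line breaks.
        writer.writerow([field.replace("\n", "-") for field in row])
    return buf.getvalue()


# Made-up sample data in the shape described in the question.
sample = '"1";"banana from "Ecuador"";"ok"\n"2";"line one\nline two";"x"\n'
print(to_clean_csv(split_records(sample)))
```

With the sample above, the second field of the first row comes out as "banana from ""Ecuador""", which csv.reader (with delimiter=';') reads back as banana from "Ecuador". If the real data can contain the terminator sequences inside a field, this approach will split in the wrong place, just as a BCP terminator would.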
