
psql import .csv - Double Quoted fields and Single Double Quote Values

Hello Stack Overflowers,

Weird question. I am having trouble importing a .csv file using psql command line arguments...

The .csv is comma-delimited, and cells/fields that contain commas are wrapped in double quotes. The problem is that some cells/fields contain a single, unpaired double quote used as the inch symbol. In the example below, psql treats the bottom two rows as one cell/field, because the unpaired quote opens a quoted value that runs on until the next double quote.

I can't seem to find a way to make this import correctly. I am hoping to not have to make changes to the file itself and just adjust my psql command.

Ex:
number, number, description  (Headers)
123,124,"description, description"
123,124,description, TV 55"
123,124,description, TV 50"

Command Ex:
\copy table FROM 'C:\Users\Desktop\folder\file.csv' CSV HEADER
\copy table FROM 'C:\Users\Desktop\folder\file.csv' WITH CSV HEADER QUOTE '"' ESCAPE '\' 

I've noticed that opening the file in Excel and saving it fixes the issue. Excel writes the records like this:

number, number, description  (Headers)
123,124,"description, description"
123,124,"description, TV 55"""
123,124,"description, TV 50"""

I don't want to save the file with Excel, though, because as soon as the file is opened in Excel some numbers are converted to scientific notation and leading zeros are dropped.
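For reference, the doubling that Excel applies is the usual CSV quoting rule: a field containing a comma or a double quote is wrapped in double quotes, and each embedded double quote is written twice. Below is a minimal Python sketch of that rule using the standard csv module. The helper name to_csv_line is made up for illustration, and every value is passed as a string, so leading zeros are left alone.

```python
import csv
import io


def to_csv_line(fields):
    """Format one row with standard CSV quoting (hypothetical helper)."""
    buf = io.StringIO()
    # The default dialect quotes only fields that need it and doubles any
    # embedded double quote, which matches the Excel output shown above.
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


if __name__ == "__main__":
    print(to_csv_line(["123", "124", "description, description"]))
    print(to_csv_line(["123", "124", 'description, TV 55"']))
    # Values are plain strings, so a leading zero is written unchanged.
    print(to_csv_line(["00123", "124", "plain text"]))
```

This only covers the writing side. It assumes the three values of each row have already been separated correctly.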

It's an ugly hack, but you can import into a single-column table with \copy table from '/path/to/file' CSV quote e'\x01' delimiter e'\x02' and then fix it up in SQL with regex functions. The quote and delimiter are set to control characters that don't occur in the file, so each line is loaded as-is into one text column. This is only workable with reasonably small CSVs, since you're duplicating the data in the single-column table while doing the import.

testdb=# create table import_data(t text);
CREATE TABLE
testdb=# \! cat /tmp/oof.csv
num0,num1,descrip
123,124,"description, description"
123,124,description, TV 55"
123,124,"description, TV 50""
testdb=# \copy import_data from /tmp/oof.csv csv header quote e'\x01' delimiter e'\x02'
COPY 3
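testdb=# CREATE TABLE fixed AS
SELECT
  -- The first two comma-separated fields are the numeric columns.
  (regexp_split_to_array(t, ','))[1] AS num1,
  (regexp_split_to_array(t, ','))[2] AS num2,
  regexp_replace(
        -- Innermost call: drop the first two fields, keep the rest of the line.
        -- Middle call: strip the first balanced pair of double quotes, if any.
        regexp_replace(regexp_replace(t, '([^,]+,[^,]+),(.*)', '\2'),
                       '"(.*?)"', '\1'),
        -- Outermost call: leaves the text unchanged, since the greedy (.*)
        -- already consumes any trailing quote.
        '(.*)(")?', '\1\2') AS descrip
FROM import_data;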
SELECT 3
testdb=# select * from fixed;
 num1 | num2 |         descrip          
------+------+--------------------------
 123  | 124  | description, description
 123  | 124  | description, TV 55"
 123  | 124  | description, TV 50"
(3 rows)
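To try the regexes on a few sample lines before running them in the database, the same steps can be sketched in Python. Treat this as an approximation of the query above rather than an exact equivalent: Python's re module is not PostgreSQL's regex engine, and the function name split_raw_line is hypothetical. Only the first two regexp_replace steps are mirrored, since the outermost one does not change these sample rows.

```python
import re


def split_raw_line(t):
    """Split one raw CSV line into (num1, num2, descrip) (hypothetical helper)."""
    parts = t.split(",")
    num1, num2 = parts[0], parts[1]
    # Step 1: drop the first two comma-separated fields, keep the rest.
    rest = re.sub(r"([^,]+,[^,]+),(.*)", r"\2", t, count=1)
    # Step 2: strip the first balanced pair of double quotes, if there is one.
    # A lone quote (the inch mark) has no partner and is left in place.
    descrip = re.sub(r'"(.*?)"', r"\1", rest, count=1)
    return num1, num2, descrip


if __name__ == "__main__":
    sample_lines = [
        '123,124,"description, description"',
        '123,124,description, TV 55"',
        '123,124,"description, TV 50""',
    ]
    for line in sample_lines:
        print(split_raw_line(line))
```

On the three sample lines from /tmp/oof.csv this gives the same descrip values as the query output above. Like the SQL version, it assumes the first two fields never contain commas or quotes.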


 