
PSQL import from CSV adding additional characters

I have a CSV file from which I'm importing a number of fields. One of the fields is a date in the format '20120401'. In the CSV file, this field is 8 characters long in every row. I created a table in Postgres and declared the column that receives this data as DATE. When I imported the CSV file, it threw an "invalid input error". To work around this, I changed the column's type to VARCHAR, thinking I could run an ALTER TABLE to change the data type afterwards. The import was successful, but the ALTER TABLE wasn't. I noticed that the first row's date has a length of 9 instead of the 8 that all the remaining rows have. Somehow it gains another character during the import, and for the life of me I can't identify where it comes from. I've tried a bunch of trim operations (TRIM, BTRIM), but they all still yield 9 characters. Any suggestions? If I remove this one row, the ALTER TABLE statement to change the column to DATE works, so it's really only this row.

Sample below:

20150401    My  Gll ES  1A3AE039E352    GCE 0.2461158

20150401    My  Gll ES  1F63E45849F1    GCE 0.8670354

Peering into my crystal ball, I see that it is a byte order mark (BOM) at the beginning of the file.

That would be the Unicode character U+FEFF, which is encoded in UTF-8 as the three bytes EF BB BF.

While byte order marks are useful in UTF-16 encodings to determine the endianness, they serve no such purpose in UTF-8, but some operating systems use them as a marker that signifies “this file is UTF-8”.
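To make the symptom concrete, here is a small Python sketch of what probably happened to the first field. It assumes the file really does start with a UTF-8 BOM, and the variable names are only illustrative. Python's `strip()` is used here as a stand-in for whitespace trimming in general, not as a model of Postgres's TRIM.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark

# The UTF-8 encoding of the BOM is the three bytes EF BB BF.
bom_bytes = BOM.encode("utf-8")

# Simulate the first field of the first row of a file that starts with a BOM.
# The plain "utf-8" codec keeps the BOM as an ordinary character.
first_field = (bom_bytes + b"20150401").decode("utf-8")

field_length = len(first_field)    # one character more than the 8 digits
stripped = first_field.strip()     # U+FEFF is not whitespace, so it survives
cleaned = first_field.lstrip(BOM)  # removing U+FEFF explicitly does work
```

The extra character is invisible when printed, which would explain why the first row looks the same as the others but reports a length of 9.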

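You'll have to remove the character, either from the file before importing it or from the first row afterwards. TRIM and BTRIM strip only spaces by default, which is why they left it in place.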
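One way to do that before the import is to rewrite the file without its leading BOM. The following is a minimal Python sketch, not the only option. The function name `strip_utf8_bom` and the file paths are hypothetical, and it assumes the file is UTF-8 encoded.

```python
import codecs
import shutil


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Read exactly as many bytes as the BOM occupies (EF BB BF).
        head = src.read(len(codecs.BOM_UTF8))
        had_bom = head == codecs.BOM_UTF8
        if not had_bom:
            # No BOM: these bytes are real data and must be kept.
            dst.write(head)
        # Copy the remainder of the file unchanged.
        shutil.copyfileobj(src, dst)
    return had_bom
```

Calling something like `strip_utf8_bom("data.csv", "data_nobom.csv")` and then importing the second file should leave every date field at 8 characters, so the column can be declared as DATE from the start.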

