PSQL import from CSV adding additional characters

I have a CSV file from which I'm importing a number of fields. One of the fields is a date in the format '20120401'. In the CSV file, this field is 8 characters long in every row. I created a table in Postgres and declared the column that receives this data as DATE. When I imported the CSV file, it threw an "invalid input error".

To work around this, I changed the column's type to VARCHAR, thinking I could run an ALTER TABLE to change the data type afterwards. The import was successful, but the ALTER TABLE wasn't. I noticed that the date in the first row has a length of 9, versus the expected 8 for all the remaining rows. Somehow it gains another character during the import, and for the life of me I can't work out where it comes from. I've tried a bunch of trim operations (TRIM, BTRIM), but they all still yield 9 characters. Any suggestions? If I remove this one row, the ALTER TABLE statement that changes the column to DATE works, so it really is only this row.

Sample below:

20150401    My  Gll ES  1A3AE039E352    GCE 0.2461158

20150401    My  Gll ES  1F63E45849F1    GCE 0.8670354

Peering into my crystal ball, I see that it is a byte order mark (BOM) at the beginning of the file.

That would be the Unicode character U+FEFF; in UTF-8 it is encoded as the bytes EF BB BF.
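To make the symptom concrete, here is a small Python sketch (Python is used purely for illustration; it is not part of the original setup). It assumes the first field of the file looks like the sample row with a UTF-8 BOM in front of it, and shows why the value measures 9 characters and why a plain whitespace trim does not help. The variable names are made up for this example.

```python
import codecs

BOM = "\ufeff"  # U+FEFF, the byte order mark

# In UTF-8 the BOM is encoded as the three bytes EF BB BF.
bom_bytes = BOM.encode("utf-8")
assert bom_bytes == codecs.BOM_UTF8
print(bom_bytes.hex(" "))  # ef bb bf

# Assumed content of the first field of the first row: BOM + the 8-digit date.
first_field = b"\xef\xbb\xbf20150401".decode("utf-8")

print(len(first_field))              # 9: the BOM counts as one character
print(len(first_field.strip()))      # 9: the BOM is not whitespace
print(len(first_field.lstrip(BOM)))  # 8: removing it explicitly works
```

The character is invisible in most editors and terminals, which is why the row looks identical to the others.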

While byte order marks are useful in UTF-16 encodings to determine the endianness, they serve no such purpose in UTF-8. Some operating systems nevertheless use them as a marker that signifies "this file is UTF-8".

You'll have to remove the character. TRIM and BTRIM strip only spaces by default, which is why they leave it in place.
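One way to do that is to clean the file before importing it. The following is a minimal sketch, assuming the file is UTF-8 and small enough to read into memory; `strip_utf8_bom` and the file names are hypothetical, not something from the question. It copies the file and drops the leading EF BB BF bytes if they are present.

```python
import codecs
import tempfile
from pathlib import Path


def strip_utf8_bom(src, dst):
    """Copy src to dst without a leading UTF-8 BOM; return True if one was removed."""
    data = Path(src).read_bytes()
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Drop exactly the three BOM bytes and keep everything else untouched.
        data = data[len(codecs.BOM_UTF8):]
    Path(dst).write_bytes(data)
    return had_bom


# Demonstration on a throwaway file that mimics the first sample row.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "with_bom.csv"
    dst = Path(tmp) / "clean.csv"
    src.write_bytes(codecs.BOM_UTF8 + b"20150401\tMy\n")
    removed = strip_utf8_bom(src, dst)
    cleaned = dst.read_bytes()

print(removed)      # True
print(cleaned[:8])  # b'20150401'
```

Working on raw bytes leaves the rest of the file byte-for-byte unchanged. If the data is already in the table, removing the U+FEFF character from the affected value before changing the column type should have the same effect.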
