COPY INTO Snowflake Table with Extra Columns
I've got a table defined in Snowflake as:
BATCH_KEY NUMBER(38,0) NULL
CTACCT VARCHAR(100) NULL
CTPAGE NUMBER(38,0) NULL
and a file that looks like this:
CTACCT VARCHAR(100)
CTPAGE NUMBER(38,0)
example:
CTACCT,CTPAGE
"Test Account",100
"Second Account", 200
My COPY INTO command looks like this:
copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
Snowflake is throwing an error due to a column count mismatch. How can I get Snowflake to ignore the column that isn't present in the file and not throw an error? I can move BATCH_KEY to the end of the table if that will help.
It appears it's possible to indicate which columns to insert into with a COPY INTO statement, so ours becomes:
copy into GLPCT_POC (CTACCT, CTPAGE) from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
We could not use a transformation as mentioned in a previous answer, because this is an external file.
Snowflake allows you to set ERROR_ON_COLUMN_COUNT_MISMATCH in the file format:
ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input data file does not match the number of columns in the corresponding table.
If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:
If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.
If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-csv
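Applied to the command from the question, the option goes inside the file format (a sketch, reusing the question's URL and SAS-token placeholder):

```sql
copy into GLPCT_POC
from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"',
             ERROR_ON_COLUMN_COUNT_MISMATCH=FALSE);
```

One caveat: per the quoted documentation, fields are matched to table columns in order of occurrence, so with BATCH_KEY as the first column, CTACCT from the file would land in BATCH_KEY. This option on its own works cleanly only if BATCH_KEY is moved to the end of the table, as the question suggests.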
You can add a "transformation" as you pull the data in with a COPY INTO query. In this case your transformation can be to add a NULL column.
However, in order to use this feature, you need to create a stage for your external source:
create or replace stage my_stage
url='azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
copy into GLPCT_POC
from (SELECT NULL, $1, $2 FROM @my_stage);
The $1 and $2 line up with the columns in the file, and the order of the columns in the SELECT clause lines up with the columns of the table.
The extra benefit of this is that if you are reusing that copy statement and/or stage, you don't need to repeat all the credential and file format information.
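For instance, once the stage exists, the column-list variant from the earlier answer can reuse it as well, with no inline credentials or file format (assuming the same @my_stage):

```sql
copy into GLPCT_POC (CTACCT, CTPAGE)
from @my_stage;
```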