
COPY INTO Snowflake Table with Extra Columns

I've got a table defined in Snowflake as:

GLPCT

BATCH_KEY NUMBER(38,0) NULL
CTACCT VARCHAR(100) NULL
CTPAGE NUMBER(38,0) NULL

and a file that looks like this:

GLPCT.csv

CTACCT VARCHAR(100)
CTPAGE NUMBER(38,0)

Example:

CTACCT,CTPAGE
"Test Account",100
"Second Account", 200

My COPY INTO command looks like this:

copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

Problem

Snowflake is throwing an error due to a column count mismatch. How can I get Snowflake to ignore the column that isn't present in the file and not throw an error? I can move BATCH_KEY to the end of the table if that will help.

It appears it's possible to specify which columns to insert into with a COPY INTO statement, so ours becomes:

copy into GLPCT_POC (CTACCT, CTPAGE) from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

We could not use a transformation as mentioned in a previous answer, because this is an external file.

Snowflake allows you to set ERROR_ON_COLUMN_COUNT_MISMATCH in the file format.

ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input data file does not match the number of columns in the corresponding table.

If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:

If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.

If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.

https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-csv
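Applied to the command from the question, the option might be set as in the sketch below. This is not tested against the asker's account. It also rests on an assumption: since fields appear to be matched to table columns by position, this probably only loads correctly if BATCH_KEY is moved to the end of the table (which the question says is possible), so that CTACCT and CTPAGE are the first two columns and BATCH_KEY is the one left NULL.

```sql
-- Assumption: GLPCT_POC has been reordered to (CTACCT, CTPAGE, BATCH_KEY),
-- so the two file fields land in the first two columns by position.
copy into GLPCT_POC
from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(
  TYPE=CSV,
  SKIP_HEADER = 1,
  FIELD_OPTIONALLY_ENCLOSED_BY='"',
  ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE  -- do not fail when the file has fewer fields than the table
);
```

With the original column order (BATCH_KEY first), the first file field would presumably be loaded into BATCH_KEY and fail the NUMBER conversion, so listing the target columns explicitly may be the safer choice in that case.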

You can add a "transformation" as you pull data in with a COPY INTO query. In this case your transformation can be to add a NULL column.

However, in order to use this feature, you need to create a stage for your external source:

create or replace stage my_stage 
url='azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

copy into GLPCT_POC 
from (SELECT NULL, $1, $2 FROM @my_stage);

The $1 and $2 line up with the columns in the file, and the order of the columns in the select clause lines up with the columns of the table.

The extra benefit of this is that if you are reusing that copy statement and/or stage, you don't need to repeat all the credential and file format information.

See "Data load with transformation" syntax in the Snowflake documentation.
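As a rough illustration of that reuse, a later load could reference the stage alone. The sketch below assumes the stage my_stage from above already exists. It also combines the stage with an explicit target column list, which is a variation and not part of the original answer, so the mapping does not depend on where BATCH_KEY sits in the table.

```sql
-- Assumes my_stage was created as shown above; its credentials and
-- file format are picked up from the stage definition.
copy into GLPCT_POC (CTACCT, CTPAGE)
from (select $1, $2 from @my_stage);
-- BATCH_KEY is not listed, so it is left NULL (the column is nullable).
```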
