I've got a table defined in Snowflake as:
BATCH_KEY NUMBER(38,0) NULL
CTACCT VARCHAR(100) NULL
CTPAGE NUMBER(38,0) NULL
and a file that looks like this:
CTACCT VARCHAR(100)
CTPAGE NUMBER(38,0)
example:
CTACCT,CTPAGE
"Test Account",100
"Second Account", 200
My copy into command looks like this:
copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
Snowflake is throwing an error due to a column count mismatch. How can I get Snowflake to ignore the column that isn't present in the file rather than throw an error? I can move BATCH_KEY to the end of the table if that would help.
It turns out you can specify which columns to load in a COPY INTO statement, so ours becomes:
copy into GLPCT_POC (CTACCT, CTPAGE) from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
We could not use a transformation as suggested in another answer, because we are loading directly from an external location rather than from a stage.
Snowflake allows you to set ERROR_ON_COLUMN_COUNT_MISMATCH in the file format:
ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE — Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input data file does not match the number of columns in the corresponding table.
If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:
If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.
If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-csv
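Applied to the command in the question, the option would look something like this (the URL and token placeholder are from the question; only the last file format option is new):

```sql
copy into GLPCT_POC
from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"',
             ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE);
```

Note that the matching is positional: with the table as defined, the CTACCT text from the file would land in the NUMBER column BATCH_KEY and fail to convert. This option only helps here if BATCH_KEY is moved to the end of the table, as the question offers, so that the trailing column is the one loaded with NULL.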
You can add a "transformation" as you pull data in with a COPY INTO statement. In this case the transformation is to supply a NULL for the missing column.
However, in order to use this feature, you need to create a stage for your external source:
create or replace stage my_stage
url='azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');
copy into GLPCT_POC
from (SELECT NULL, $1, $2 FROM @my_stage);
The $1 and $2 refer to the first and second columns of the file, and the order of the expressions in the SELECT clause lines up with the columns of the table.
An extra benefit is that if you reuse that copy statement and/or stage, you don't need to repeat the credential and file format information.
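For example, because the stage carries both the credentials and the file format, later statements can reference it directly. A sketch, assuming the my_stage stage created above:

```sql
-- the stage already holds the URL, SAS token, and CSV file format
list @my_stage;

-- inspect the raw file columns without loading anything
select $1 as ctacct, $2 as ctpage from @my_stage;

-- rerun the load at any time with no credentials in the statement
copy into GLPCT_POC
from (SELECT NULL, $1, $2 FROM @my_stage);
```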