Snowpipe: loading only the first row

Hello dear community,

I need to load only the first row (the header) of each incoming file into a table via Snowpipe. Could you point me to the correct parameter? I guess it has to be specified somehow in FILE_FORMAT. For example, this is the file format I have now:

FILE_FORMAT = (TYPE = CSV
             --COMPRESSION = GZIP
             --SKIP_HEADER=1
             --FIELD_OPTIONALLY_ENCLOSED_BY = '"'
             VALIDATE_UTF8 = FALSE
             FIELD_DELIMITER  = '|'
             ESCAPE_UNENCLOSED_FIELD = NONE
             DATE_FORMAT      = 'YYYYMMDD'
             TIMESTAMP_FORMAT = 'DD.MM.YYYY HH24:MI:SS'
             TRIM_SPACE       = TRUE
           );

I don't think there is a way to do this in one step.

However, you could load the full file and add the metadata value METADATA$FILE_ROW_NUMBER as an extra column in the COPY INTO statement that loads the table. See the Snowflake documentation on querying metadata for staged files.

Then, as a second step, you could query that table for the rows where FILE_ROW_NUMBER = 1. You could easily write that as a CREATE TABLE AS SELECT ... or an INSERT INTO ... SELECT ..., depending on what you are trying to do.
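A minimal sketch of that second step, assuming the full file has already been loaded into a table called raw_rows with the metadata stored in columns named filename and file_row_number. All table and column names here are hypothetical placeholders, not names from the question:

```sql
-- Assumption: raw_rows holds every loaded row, with METADATA$FILENAME stored
-- in filename and METADATA$FILE_ROW_NUMBER stored in file_row_number.
CREATE TABLE header_rows AS
SELECT
    filename,
    col1,
    col2
FROM raw_rows
WHERE file_row_number = 1;  -- keep only the first row of each file
```

Because SKIP_HEADER is commented out in the file format from the question, the header line is loaded like any other row, so it is the row that carries file_row_number = 1.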

If the objective is to avoid processing the full file, I'm not sure there is an option for that within Snowflake. You could, however, add a step before loading that creates a different file containing only the first record, and load that file instead. Probably not the most helpful advice, but maybe worth considering.

A snowpipe consists of 2 distinct steps:

  1. Pipe (load): loads a file from a stage (into memory)
  2. Copy into: saves data from memory into a table.

You cannot alter the records loaded by the pipe step, as the previous answer mentions. You can, however, lightly alter the records that are written into the table in the copy step, for example by including metadata columns (see Snowflake's documentation on querying metadata for staged files):

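CREATE PIPE my_pipe
AS
COPY INTO my_table (filename, file_row_number, col1, col2)
FROM (
    SELECT
        METADATA$FILENAME,         -- name of the staged file the row came from
        METADATA$FILE_ROW_NUMBER,  -- row number of the record within that file
        $1, $2                     -- first two columns of the file
    FROM @snowpipe_db.public.mystage
);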

After that, you can create an intermediate table that holds only the records with file_row_number = 1.
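Since the pipe keeps appending to my_table as new files arrive, one way to keep such an intermediate table up to date is an INSERT that only picks up header rows from files it has not seen yet. This is a sketch under assumptions: the table name my_table_headers and the STRING column types are hypothetical, and the statement would have to be run repeatedly (manually or on a schedule).

```sql
-- Hypothetical intermediate table holding one header row per loaded file.
CREATE TABLE IF NOT EXISTS my_table_headers (
    filename STRING,
    col1     STRING,
    col2     STRING
);

-- Copy the first row of every file that is not yet in my_table_headers.
INSERT INTO my_table_headers (filename, col1, col2)
SELECT t.filename, t.col1, t.col2
FROM my_table AS t
WHERE t.file_row_number = 1
  AND NOT EXISTS (
      SELECT 1
      FROM my_table_headers AS h
      WHERE h.filename = t.filename
  );
```

The NOT EXISTS check makes the statement safe to re-run: files whose header row was already copied are skipped.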

As an aside, if you are concerned about data security, you should consider truncating the files in your stage to the rows you need before Snowflake loads them at all.

 