简体   繁体   中英

Snowflake how can we loop over each row of a temp table and insert its values with into another table where each field with its value is a single row?

We are loading data into a fact table, we the original temporary table on Snowflake looks like the following:

在此处输入图像描述

Where indicator_nbr fields are questions asked within a survey.

We are using data modelling techniques in building our warehouse database, so the data will be added into a fact table like so:

在此处输入图像描述

Then the same for the indicator 2 and 3 and so on if there is other questions.

Each Field with its value will be as a single row. Of course there is other metadata to be added like load_dt and record_src but they are not a problem.

The current script is doing the following:

  1. Get the fields into an array => fields_array = ['indicator_1', 'indicator_2', 'indicator_3']

  2. A loop will run over the array and start adding each field with its value for each row. So imagine we are having 100 rows, we will run 300 inserts, one at a time:

    for (var col_num = 0; col_num<fields_array.length; col_num = col_num+1) {

     var COL_NAME = fields_array[col_num]; var field_value_query = "INSERT INTO SAT_FIELD_VALUE SELECT md5(id), CURRENT_TIMESTAMP(), NULL, 'SRC', "+COL_NAME+", md5(foreign_key_field) FROM "+TEMP_TABLE_NAME+"";

    }

As mentioned in the comment on this post showing the full script, it is better to loop over a string concatenating each from values of the insert query.

There is 2 issues of the suggested solution:

  1. There is a size limit of a query on Snowflake (it should be less than 1 MB);
  2. if we are going to loop over each field and concatenate the from values , we should do a select query as well from the temp table to get the value of the column, so there will be no optimization, or we will reduce the time a little bit but not to much.

EDIT: A possible solution

I was thinking of doing an sql query selecting everything from the temp table, and do hashing and everything and save it into an array after transposing, but I have no idea how to do it.

Not sure if this is what you're looking for but it seems as though you just want to do a pivot:

Setup example scenario

create or replace transient table source_table
(
    id          number,
    indicator_1 varchar,
    indicator_2 number,
    indicator_3 varchar
);

insert overwrite into source_table
values (1, 'Test', 2, 'DATA'),
       (2, 'Prod', 3, 'DATA'),
       (3, 'Test', 1, 'METADATA'),
       (4, 'Test', 1, 'DATA')
;

create or replace transient table target_table
(
    hash_key varchar,
    md5      varchar
);

Run insert

select
    name_col as hash_key,
    md5(id)
from (select
          id,
          indicator_1,
          indicator_2::varchar as indicator_2,
          indicator_3
      from source_table) unpivot ( val_col for name_col in (indicator_1, indicator_2, indicator_3))
;

This results in a target_table that looks like this:

+-----------+--------------------------------+
|HASH_KEY   |MD5                             |
+-----------+--------------------------------+
|INDICATOR_1|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_2|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_3|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_1|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_2|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_3|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_1|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_2|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_3|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_1|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_2|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_3|a87ff679a2f3e71d9181a67b7542122c|
+-----------+--------------------------------+

It is great scenario to use INSERT ALL :

INSERT ALL
INTO dst_tab(hash_key, md5) VALUES (indicator_1, md5)
INTO dst_tab(hash_key, md5) VALUES (indicator_2, md5)
INTO dst_tab(hash_key, md5) VALUES (indicator_3, md5) 
SELECT MD5(id) AS md5, indicator_1, indicator_2::STRING AS indicator_2, indicator_3 
FROM src_tab;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM