
How to merge variant column data from two different tables in Snowflake

I have two tables with thousands of columns each. The tables contain VARIANT columns, and I want to merge the VARIANT data from the two tables based on the keys inside it.

For example:

Table 1, column name: SRC, value: {col1:val1,col2:val2,col3:val3}

Table 2, column name: SRC, value: {col1:val1,col2:val2,col4:val4}

The output I want after merging: Table 1: SRC = {col1:val1,col2:val2,col3:val3,col4:val4}

Keys that already exist in the variant column should be updated, and new keys should be inserted.

I have tried the object_insert() function, but it can only update a single field at a time, which is syntactically hard to manage for thousands of columns. How can I do this for multiple fields at once? Can anyone help me with this?

If you know the keys in advance, and the variants are objects, then this would work:

with
-- sample rows standing in for the two tables; parse_json already returns a VARIANT
a as ( select 1 id, parse_json('{"col1":"val1","col2":"val2","col3":"val3"}') src),
b as ( select 1 id, parse_json('{"col1":"val1","col2":"val2","col4":"val4"}') src)
select
  -- for each known key, take b's value when present, otherwise fall back to a's
  object_construct(
    'col1', coalesce(b.src:col1,a.src:col1),
    'col2', coalesce(b.src:col2,a.src:col2),
    'col3', coalesce(b.src:col3,a.src:col3),
    'col4', coalesce(b.src:col4,a.src:col4)
  ) as src
from a left join b on a.id=b.id;

The output is:

SRC              
-----------------
{   "col1": "val1",   "col2": "val2",   "col3": "val3",   "col4": "val4" }

If you don't know the keys in advance, you could flatten the objects, join on the keys, coalesce() the values, and use object_agg() to assemble the combined objects.
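The flatten-and-aggregate approach could look roughly like the sketch below. It reuses the sample CTEs a and b from the query above and rests on a few assumptions: id is unique within each table, both src values are objects, and b's value should win when a key exists on both sides. The CTE names fa and fb are made up for illustration.

```sql
with
a as ( select 1 id, parse_json('{"col1":"val1","col2":"val2","col3":"val3"}') src),
b as ( select 1 id, parse_json('{"col1":"val1","col2":"val2","col4":"val4"}') src),
-- one row per (id, key) for each side
fa as ( select a.id, f.key, f.value from a, lateral flatten(input => a.src) f),
fb as ( select b.id, f.key, f.value from b, lateral flatten(input => b.src) f)
select
  coalesce(fa.id, fb.id) as id,
  -- rebuild one object per id; b's value is preferred, a's is the fallback
  object_agg(coalesce(fa.key, fb.key), coalesce(fb.value, fa.value)) as src
from fa
full outer join fb on fa.id = fb.id and fa.key = fb.key
group by coalesce(fa.id, fb.id);
```

The full outer join keeps keys that exist on only one side, so no key list has to be written out by hand. Note that a JSON null stored in b is a VARIANT value rather than a SQL NULL, so coalesce() would keep it instead of falling back to a's value.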

Hope that helps.

You can use a JavaScript UDF to do this. Here is a naive example that you can build on; it does a very simple shallow merge:

-- Create an example table
create or replace transient table test_table (column1 VARIANT, column2 VARIANT);

-- Insert some sample data
INSERT OVERWRITE INTO test_table (column1, column2)
select PARSE_JSON('{"a": "row1_val_a", "c": "row1_val_c"}'), PARSE_JSON('{"b": "row1_val_b"}')
union all
select PARSE_JSON('{"a": "row2_val_a", "b": "row2_val_b"}'), PARSE_JSON('{"c": "row2_val_c"}')
;

-- Have a look at the table
select * from test_table;

-- Create the javascript UDF that merges the two columns
CREATE OR REPLACE FUNCTION merge_json(OBJ_1 VARIANT, OBJ_2 VARIANT)
    RETURNS variant
    LANGUAGE JAVASCRIPT
    AS 
    $$
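    // Shallow merge: copy every own property of src onto obj.
    // Keys in src overwrite keys already in obj; nested objects are not merged.
    function extend(obj, src) {
      for (var key in src) {
          if (src.hasOwnProperty(key)) obj[key] = src[key];
      }
      return obj;
    }
    return extend(OBJ_1, OBJ_2);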
    $$;

-- Use the UDF in a SQL query
select merge_json(column1, column2) from test_table;

This example assumes that the VARIANT objects are in the same table, just in separate columns. You could change it to grab the variants from different tables pretty easily.
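As a rough sketch of that change, the UDF could be called across a join. The names table1, table2, id and src are hypothetical here: the sketch assumes each table has a join key id and a VARIANT column src, which the original question does not spell out.

```sql
-- table1 and table2 are hypothetical tables, each with an id column
-- and a VARIANT column named src
select
  table1.id,
  -- keys from table2.src overwrite matching keys in table1.src
  merge_json(table1.src, table2.src) as src
from table1
join table2 on table1.id = table2.id;
```

An inner join returns only ids present in both tables. If rows without a match in table2 should be kept, a left join would be the natural variation, but check how the UDF behaves when its second argument is NULL before relying on it.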
