
Why does Snowflake change the order of JSON values when converting them into a flattened list?

I have JSON objects stored in a table, and I am trying to write a query that gets the first element from that JSON.

Replication script

create table staging.par.test_json (id int, val varchar(2000)); 

insert into staging.par.test_json values (1, '{"list":[{"element":"Plumber"},{"element":"Craft"},{"element":"Plumbing"},{"element":"Electrics"},{"element":"Electrical"},{"element":"Tradesperson"},{"element":"Home services"},{"element":"Housekeepings"},{"element":"Electrical Goods"}]}');
insert into staging.par.test_json values (2,'
  {
    "list": [
      {
        "element": "Wholesale jeweler"
      },
      {
        "element": "Fashion"
      },
      {
        "element": "Industry"
      },
      {
        "element": "Jewelry store"
      },
      {
        "element": "Business service"
      },
      {
        "element": "Corporate office"
      }
    ]
  }');



with cte_get_cats AS
(
select id, 
       val as category_list 
       from staging.par.test_json
),
cats_parse AS
(
  select id,
         parse_json(category_list) as c
  from cte_get_cats
),
distinct_cats as
(
  select id,
         INDEX,
         UPPER(cast(value:element AS varchar)) As c
  from 
      cats_parse,
      LATERAL flatten(INPUT => c:"list")
  order by 1,2 
) ,
cat_array AS
    (
        SELECT  
            id,
            array_agg(DISTINCT c) AS sds_categories
        FROM
            distinct_cats
        GROUP BY 1
    ),
sds_cats AS
( 
         select id,
         cast(sds_categories[0] AS varchar) as sds_primary_category
         from cat_array
)
select * from sds_cats;

Values: categories

{"list":[{"element":"Plumber"},{"element":"Craft"},{"element":"Plumbing"},{"element":"Electrics"},{"element":"Electrical"},{"element":"Tradesperson"},{"element":"Home services"},{"element":"Housekeepings"},{"element":"Electrical Goods"}]}

Flattening it into a list gives me

["Plumber","Craft","Plumbing","Electrics","Electrical","Tradesperson","Home services","Housekeepings","Electrical Goods"]

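Problem: this order is not always the same. Snowflake seems to change the order, and sometimes it sorts the values alphabetically. How can I make the order stable? I do not want the order to change.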

The problem is the way you are using ARRAY_AGG:

        array_agg(DISTINCT c) AS sds_categories

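Specifying it like this gives Snowflake no guidance on how to arrange the contents of the array. You should not assume that the array will be built in the same order as its input records: it might be, but that is not guaranteed. So you probably want to do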

        array_agg(DISTINCT c) within group (order by index) AS sds_categories

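But that does not work, because once you use DISTINCT c, the index value for each c is no longer known (Snowflake requires DISTINCT and WITHIN GROUP to refer to the same column). Maybe you do not need DISTINCT, in which case this will work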

        array_agg(c) within group (order by index) AS sds_categories
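If the only goal is the first element of the list, as the question states, the array does not need to be rebuilt at all. The following is a minimal sketch against the question's own test table, assuming that every `val` is valid JSON with a top-level "list" array; it reads position 0 directly, so no aggregation order is involved.

```sql
-- Sketch: read the first list element directly by position.
-- Assumes val always holds valid JSON with a top-level "list" array;
-- parse_json raises an error on invalid input.
select id,
       upper(parse_json(val):"list"[0]:"element"::varchar) as sds_primary_category
from staging.par.test_json
order by id;
```

With the two sample rows from the replication script this should give PLUMBER for id 1 and WHOLESALE JEWELER for id 2. If some rows might contain malformed JSON, TRY_PARSE_JSON could be swapped in so that those rows yield NULL instead of failing.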

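If you really do need DISTINCT, then you need to associate an index with each distinct c value somehow. One way is to apply the MIN function to index in the input, so that each distinct value keeps the position of its first occurrence. Here is a full query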

with cte_get_cats AS
(
select id, 
       val as category_list 
       from staging.par.test_json
),
cats_parse AS
(
  select id,
         parse_json(category_list) as c
  from cte_get_cats
),
distinct_cats as
(
  select id,
         MIN(INDEX) AS index,
         UPPER(cast(value:element AS varchar)) As c
  from 
      cats_parse,
      LATERAL flatten(INPUT => c:"list")
  group by 1,3 
) ,
cat_array AS
    (
        SELECT  
            id,
            array_agg(c) within group (order by index) AS sds_categories
        FROM
            distinct_cats
        GROUP BY 1
    ),
sds_cats AS
( 
         select id,
         cast(sds_categories[0] AS varchar) as sds_primary_category
         from cat_array
)
select * from cat_array;
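The sample data in the question contains no duplicates, so the effect of MIN(index) is not visible there. The sketch below uses a made-up inline row (the `sample` CTE, the `doc` column and the `first_index` alias are hypothetical names, not part of the original tables) in which the same category appears twice with different casing, to show that each distinct value keeps the position of its first occurrence.

```sql
-- Sketch with hypothetical inline data: "Plumber" and "Craft" each appear twice
-- once UPPER is applied, so DISTINCT-style deduplication is actually exercised.
with sample as (
  select 1 as id,
         parse_json('{"list":[{"element":"Plumber"},{"element":"craft"},{"element":"PLUMBER"},{"element":"Craft"}]}') as doc
),
distinct_cats as (
  select s.id,
         min(f.index) as first_index,                    -- position of the first occurrence
         upper(f.value:"element"::varchar) as category   -- normalized value to deduplicate on
  from sample s,
       lateral flatten(input => s.doc:"list") f
  group by 1, 3
)
select id,
       array_agg(category) within group (order by first_index) as sds_categories
from distinct_cats
group by 1;
```

Here PLUMBER has a minimum index of 0 and CRAFT a minimum index of 1, so PLUMBER should come first in the resulting array regardless of how Snowflake happens to process the rows.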

