简体   繁体   中英

Snowflake - extract JSON array string object values into pipe separated values

I have a nested JSON array which is a string object which has been stored into variant type stage table and I want to extract particular string object value and populate with pipe separated values if more than one object found. Can someone help me to achieve the desired output format please.

Sample JSON data

    {"issues": [
    {
    "expand": "",
    "fields": {
        "customfield_10010": [
        "com.atlassian.xxx.yyy.yyyy.Sprint@xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]",
        "com.atlassian.xxx.yyy.yyyy.Sprint@abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"
        ],
        "customfield_10011": "obcd",
        "customfield_10024": null,
        "customfield_10034": null,
        "customfield_10035": null,
        "customfield_10037": null,
    },
    "id": "123456",
    "key": "SUE-1234",
    "self": "xyz"
    }]}

I don't have any idea on how to separate the string objects inside an array with snowflake. By using the below query I can get whole string converted into pipe separated values.

select
    a.value:id::number as ISSUE_ID,
    a.value:key::varchar as ISSUE_KEY,
    array_to_string(a.value:fields.customfield_10010, '|') as CF_10010_Data
from
    ABC.VARIANT_TABLE,
    lateral flatten( input => payload_json:issues) as a;

But I need to extract particular string object value. Say for example id value such as 1234 & 1239 to be populated as pipe separated as shown below.

ISSUE_ID    ISSUE_KEY   SPRINT_ID
123456      SUE-1234    1234|1239

Any idea on this to get desired result is much appreciated. Thanks..

It looks like the data within [...] for your sprints are just details about that sprint. I think it would be easiest for you to actually populate a separate sprints table with data on each sprint, and then you can join that table to the Sprint ID values parsed from the API response you showed with issues data.

with
    jira_responses as (
        select
            $1 as id,
            $2 as body
        from (values
            (1, '{"issues":[{"expand":"","fields":{"customfield_10010":["com.atlassian.xxx.yyy.yyyy.Sprint@xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]","com.atlassian.xxx.yyy.yyyy.Sprint@abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"],"customfield_10011":"obcd","customfield_10024":null,"customfield_10034":null,"customfield_10035":null,"customfield_10037":null},"id":"123456","key":"SUE-1234","self":"xyz"}]}')
        )
    )
select
    issues.value:id::integer as issue_id,
    issues.value:key::string as issue_key,
    get(split(sprints.value::string, '['), 0)::string as sprint_id
from jira_responses,
lateral flatten(input => parse_json(body):issues) issues,
lateral flatten(input => parse_json(issues.value):fields:customfield_10010) sprints

Based on your sample data, the results would look like the following.

横向展平两次以获得 Sprint ID

See Snowflake reference docs below.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM