Parsing JSON without key names to retrieve a column

I am loading JSON from data.gov that has no key names for the values in the data (example below); the metadata is available separately.

I am able to load the JSON into a variant column, but I cannot see how to parse it and query for specific columns, e.g. Frankford below. I have tried JSONcol:data[0], which returns the entire entry, but I cannot see how to specify, say, column 4.

  {
    data: [ [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\"address\": \"4509 BELAIR ROAD\", \"city\": \"Baltimore\", \"state\": \"MD\", \"zip\": \"\"}", null, null, null, true ], null, null, null ]]
    }

The following code is used to create and load the Snowflake table:

create or replace table snowpipe.public.snowtable(jsontext variant);

copy into snowpipe.public.snowtable
    from @snowpipe.public.snowstage
    file_format = (type = 'JSON');

I'm not sure exactly how your variant data looks once it is loaded, so I experimented by building a variant from your object with PARSE_JSON. I had to double each backslash (\\) to make the string valid SQL.

select 
    PARSE_JSON('{ data: [ [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\\"address\\": \\"4509 BELAIR ROAD\\", \\"city\\": \\"Baltimore\\", \\"state\\": \\"MD\\", \\"zip\\": \\"\\"}", null, null, null, true ], null, null, null ]]}') as j
    ,j:data as jd
    ,jd[0] as jd0
    ,jd0[3] as jd0_3
    ,array_slice(j:data[0],3,5) as jd0_3to4
;

This shows that you can use [n] notation to index arrays (indexes are zero-based; "Frankford", for example, is j:data[0][10]). The query returns:

J: { "data": [ [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\"a...

JD: [ [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\"address\": \"4509 BELAIR ROAD\", \"city\": \"...

JD0: [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\"address\": \"4509 BELAIR ROAD\", \"city\": \"Baltimore\", \"state\": \"MD\", \"...

JD0_3: 1486063689

JD0_3TO4: [ 1486063689, null ]
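
Applied to the value the question asks about, the same indexing can be sketched as below. This is a minimal sketch that assumes the layout of the sample row (the neighborhood at position 10, the address array at position 13). The CTE name sample_row and the aliases neighborhood and street_address are made up for illustration, since the real column names live in the separate metadata. Dollar quoting ($$...$$) is used so the backslashes do not need doubling.

```sql
-- Sketch: pull individual positions out of the first inner array of "data".
-- sample_row and the aliases are hypothetical names; positions come from the sample.
with sample_row as (
    select parse_json($$
        { "data": [ [ "row-ea6u~fkaa~32ry", "0B8F94EE5292", 0, 1486063689, null, 1486063689, null, "{ }", "410", "21206", "Frankford", "2", "NORTHEASTERN", [ "{\"address\": \"4509 BELAIR ROAD\", \"city\": \"Baltimore\", \"state\": \"MD\", \"zip\": \"\"}", null, null, null, true ], null, null, null ] ] }
    $$) as j
)
select
    j:data[0][10]::string as neighborhood,   -- 'Frankford'
    -- Element 13 is an array whose first item is JSON stored as a string,
    -- so it must be parsed again before its keys can be addressed.
    parse_json(j:data[0][13][0]::string):address::string as street_address   -- '4509 BELAIR ROAD'
from sample_row;
```

The ::string casts turn the variant values into plain text, so they come back without the surrounding double quotes.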

So if data holds an unknown number of first-level elements that you want to access, use LATERAL FLATTEN, like so:

WITH data as (
    select PARSE_JSON('{ data: [ [ "row-1", "0B8", 0 ],["row-2", "F94", 2], 
["row-3", "EE5", 4]]}') as j
)
select f.value[0]::text as row_name
    ,f.value[1]::text as serial_number
    ,f.value[2]::number as num
from data d,
lateral flatten(input=> d.j:data) f;

gives:

 ROW_NAME   SERIAL_NUMBER   NUM
 row-1      0B8             0
 row-2      F94             2
 row-3      EE5             4
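
Against the table from the question, the same pattern might look like the sketch below. It assumes each loaded row of snowpipe.public.snowtable holds one document with a top-level data array, as in the sample. The aliases are hypothetical stand-ins for the names in the separate metadata.

```sql
-- Sketch: one output row per inner array in "data".
-- All aliases are hypothetical; positions are taken from the sample row.
select
    f.index             as row_position,   -- zero-based position within "data"
    f.value[0]::string  as row_id,
    f.value[10]::string as neighborhood,
    f.value[12]::string as district
from snowpipe.public.snowtable t,
     lateral flatten(input => t.jsontext:data) f;
```

FLATTEN exposes each inner array as f.value, so the positional indexing shown earlier applies to it unchanged.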

