简体   繁体   中英

Azure Synapse Analytics Json Flatten

I'm new to Azure Synapse and currently have the following problem:

I get a json that looks like the following:

{
"2022-02-01":[
    {
      "shiftId": ,
      "employeeId": ,
      "duration": ""
    },
    {
      "shiftId": ,
      "employeeId": ,
      "duration": ""
    }
],
  "2022-02-02": [
    {
      "shiftId": ,
      "employeeId": ,
      "duration": ""
    }
],
"2022-02-03": [
    {
      "shiftId": ,
      "employeeId": ,
      "duration": ""
    },
    {
      "shiftId": ,
      "employeeId": ,
      "duration": ""
    }
],
  "2022-02-4": []
}

Now I would like to convert this so that I get it in a view. I have already tried with a dataflow as array of documents but I get an error.

"Malformed records are detected in schema inference. Parse Mode: FAILFAST"

I want something like:

date         shiftId   employeeId   duration
___________|_________|____________|_________
2022-02-01 | 1234    | 345345     | 420
2022-02-01 | 2345    | 345345     | 124
2022-02-02 | 5345    | 123567     | 424
2022-02-03 | 5675    | 987542     | 123
2022-02-03 | 9456    | 234466     | 754

Azure Synapse Analytics, dedicated SQL pools are actually very capable with JSON, supporting OPENJSON and JSON_VALUE , so you could just use a Stored Procedure with the JSON as a parameter. A simple exmaple:

SELECT
    k.[key] AS [shiftDate],
    JSON_VALUE( d.[value], '$.shiftId' ) shiftId,
    JSON_VALUE( d.[value], '$.employeeId' ) employeeId,
    JSON_VALUE( d.[value], '$.duration' ) duration
FROM OPENJSON( @json, '$' ) k
    CROSS APPLY OPENJSON( k.value, '$' ) d;

The full code:

DECLARE @json NVARCHAR(MAX) = '{
    "2022-02-01": [
        {
            "shiftId": 1234,
            "employeeId": 345345,
            "duration": 420
        },
        {
            "shiftId": 2345,
            "employeeId": 345345,
            "duration": 124
        }
    ],
    "2022-02-02": [
        {
            "shiftId": 5345,
            "employeeId": 123567,
            "duration": 424
        }
    ],
    "2022-02-03": [
        {
            "shiftId": 5675,
            "employeeId": 987542,
            "duration": 123
        },
        {
            "shiftId": 9456,
            "employeeId": 234466,
            "duration": 754
        }
    ]
}'


SELECT
    k.[key] AS [shiftDate],
    JSON_VALUE( d.[value], '$.shiftId' ) shiftId,
    JSON_VALUE( d.[value], '$.employeeId' ) employeeId,
    JSON_VALUE( d.[value], '$.duration' ) duration
FROM OPENJSON( @json, '$' ) k
    CROSS APPLY OPENJSON( k.value, '$' ) d;

My results:

在此处输入图像描述

You could use a Synapse Notebook or Mapping Data Flows if you wanted something more dynamic.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM