Azure Data Factory V2 Copy Activity with Rest API giving one row for nested JSON

I am trying to flatten nested JSON returned from a REST source using a Copy activity. The pipeline definition is below. The problem is that the pipeline returns only the first object from the JSON dataset and skips all the remaining rows. How can I iterate over the nested objects?

Thanks, Sameet

{
    "name": "STG_NCR2",
    "properties": {
        "activities": [
            {
                "name": "Copy data1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "RestSource",
                        "httpRequestTimeout": "00:01:40",
                        "requestInterval": "00.00:00:00.010",
                        "requestMethod": "GET",
                        "additionalHeaders": {
                            "OData-MaxVersion": "4.0",
                            "OData-Version": "4.0",
                            "Prefer": "odata.include-annotations=*"
                        }
                    },
                    "sink": {
                        "type": "AzureSqlSink"
                    },
                    "enableStaging": false,
                    "translator": {
                        "type": "TabularTranslator",
                        "mappings": [
                            {
                                "source": {
                                    "path": "$['value'][0]['tco_ncrid']"
                                },
                                "sink": {
                                    "name": "NCRID"
                                }
                            },
                            {
                                "source": {
                                    "path": "['tco_name']"
                                },
                                "sink": {
                                    "name": "EquipmentSerialNumber"
                                }
                            }
                        ],
                        "collectionReference": "$['value'][0]['tco_ncr_tco_equipment']"
                    }
                },
                "inputs": [
                    {
                        "referenceName": "Rest_PowerApps_NCR",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "Prestaging_PowerApps_NCREquipments",
                        "type": "DatasetReference"
                    }
                ]
            }
        ],
        "annotations": []
    }
}

The JSON returned by the REST source has the following format:

[ 
   { 
      "value":[ 
         { 
            "tco_ncrid":"abc-123",
            "tco_ncr_tco_equipment":[ 
               { 
                  "tco_name":"abc"
               }
            ]
         },
         { 
            "tco_ncrid":"abc-456",
            "tco_ncr_tco_equipment":[ 
               { 
                  "tco_name":"xyz"
               },
               { 
                  "tco_name":"yzx"
               }
            ]
         }
      ]
   }
]
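
For reference, the result I am after is one row per (NCRID, equipment) pair. Below is a minimal Python sketch of that flattening, run outside Data Factory against a sample shaped like the response above. The function name `flatten_ncr_equipment` and the `SAMPLE` constant are hypothetical and only illustrate the target rows; this is not how the Copy activity works internally.

```python
import json

# Sample payload with the same shape as the REST response shown above.
SAMPLE = json.loads("""
[
  {
    "value": [
      {
        "tco_ncrid": "abc-123",
        "tco_ncr_tco_equipment": [{"tco_name": "abc"}]
      },
      {
        "tco_ncrid": "abc-456",
        "tco_ncr_tco_equipment": [{"tco_name": "xyz"}, {"tco_name": "yzx"}]
      }
    ]
  }
]
""")


def flatten_ncr_equipment(payload):
    """Return one row per equipment item, each carrying its parent NCRID."""
    rows = []
    for envelope in payload:                      # outer array
        for ncr in envelope.get("value", []):     # every element of "value"
            for equipment in ncr.get("tco_ncr_tco_equipment", []):
                rows.append({
                    "NCRID": ncr.get("tco_ncrid"),
                    "EquipmentSerialNumber": equipment.get("tco_name"),
                })
    return rows


ROWS = flatten_ncr_equipment(SAMPLE)
for row in ROWS:
    print(row)
# {'NCRID': 'abc-123', 'EquipmentSerialNumber': 'abc'}
# {'NCRID': 'abc-456', 'EquipmentSerialNumber': 'xyz'}
# {'NCRID': 'abc-456', 'EquipmentSerialNumber': 'yzx'}
```

The sink columns `NCRID` and `EquipmentSerialNumber` are the ones used in the pipeline mappings; everything else in the sketch is illustrative.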

This can be partially resolved by amending the translator property as follows:

"translator": {
    "type": "TabularTranslator",
    "mappings": [
        {
            "source": {
                "path": "$.['value'][0].['tco_ncrid']"
            },
            "sink": {
                "name": "NCRID",
                "type": "String"
            }
        },
        {
            "source": {
                "path": "$.['value'][0].['tco_text_id']"
            },
            "sink": {
                "name": "EquipmentDescription",
                "type": "String"
            }
        },
        {
            "source": {
                "path": "['tco_name']"
            },
            "sink": {
                "name": "EquipmentSerialNumber",
                "type": "String"
            }
        }
    ],
    "collectionReference": "$.['value'][*].['tco_ncr_tco_equipment']"
}

This forces the pipeline to iterate over the nested array, but the NCRID is still hard-coded to the first element of the value array. That is not quite what I want: I need every equipment serial number paired with its own NCRID. Still researching...
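
To make the remaining problem concrete, here is a rough plain-Python emulation of what the amended mapping appears to do: the collection path with `[*]` walks every equipment item, while the absolute path with `[0]` always reads the first NCR. This is a sketch of the observed behaviour under that assumption, not Data Factory's actual implementation; `emulate_amended_mapping` and `DOCUMENT` are hypothetical names, and the `tco_text_id` mapping is left out because the sample response does not contain that field.

```python
# One document from the REST response, shaped like the sample in the question.
DOCUMENT = {
    "value": [
        {
            "tco_ncrid": "abc-123",
            "tco_ncr_tco_equipment": [{"tco_name": "abc"}],
        },
        {
            "tco_ncrid": "abc-456",
            "tco_ncr_tco_equipment": [{"tco_name": "xyz"}, {"tco_name": "yzx"}],
        },
    ]
}


def emulate_amended_mapping(document):
    """Emulate the amended translator: iterate all equipment, fixed NCRID."""
    # Absolute path $.['value'][0].['tco_ncrid'] is resolved once, from the
    # first element of "value", no matter which equipment row is produced.
    first_ncr_id = document["value"][0]["tco_ncrid"]

    rows = []
    # collectionReference $.['value'][*].['tco_ncr_tco_equipment'] visits the
    # equipment arrays of every NCR.
    for ncr in document["value"]:
        for equipment in ncr["tco_ncr_tco_equipment"]:
            rows.append({
                "NCRID": first_ncr_id,
                # Relative path ['tco_name'] is read from the current item.
                "EquipmentSerialNumber": equipment["tco_name"],
            })
    return rows


EMULATED_ROWS = emulate_amended_mapping(DOCUMENT)
for row in EMULATED_ROWS:
    print(row)
# {'NCRID': 'abc-123', 'EquipmentSerialNumber': 'abc'}
# {'NCRID': 'abc-123', 'EquipmentSerialNumber': 'xyz'}
# {'NCRID': 'abc-123', 'EquipmentSerialNumber': 'yzx'}
```

All three serial numbers come through, but `xyz` and `yzx` are attributed to `abc-123` instead of `abc-456`, which is exactly the mismatch described above.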

