How to resolve HttpStatusCode 429 "TOO_MANY_REQUESTS" in Azure Data Factory?

I am using ADF's Copy Activity with a REST API linked service to make REST API calls and fetch JSON data from API links. Each API link works fine on its own, but when I call multiple links in a ForEach activity, some of them fail to return data with HttpStatusCode 429, "TOO_MANY_REQUESTS".

Q: How can I resolve this by adding a delay or something similar, so that my API calls don't exceed the limit?

Refer to the error image here: https://i.stack.imgur.com/4DJDe.jpg

Note: The API quota limitation is 10 requests per minute.

Solutions I tried:

  • [Worked] Added a retry and a retry interval (60 sec.) to the Copy Activity. With that, a call fails on the first attempt but succeeds in pulling the data on the second. Image Link: https://i.stack.imgur.com/TVvD5.jpg

  • [Didn't work] Set the ForEach activity's execution to sequential to avoid sending all requests at once, but this doesn't seem to help, as I get the same error.

The ForEach activity loops through these input parameters:

...
    {
        "sourceBaseURL":"http://api.xyz.io",
        "sourceRelativeURL":"abcde/abc",
        "sinkFileName":"test/file_name1.json",
        "requestBody": "{\"abc\": [1,2,3]}"
    },
    {
        "sourceBaseURL":"http://api.xyz.io",
        "sourceRelativeURL":"abcde/abc/cde",
        "sinkFileName":"test/file_name2.json",
        "requestBody": "{\"abc\": [1,2,3]}"
    },
...

But I wanted to know: is there any way to add a delay as a header to avoid this limit?

  • Ex. Added Retry-After: 60 as a header, but this didn't work.

Image Reference: https://i.stack.imgur.com/yzKFf.jpg

JSON of my configuration in the Copy Activity:

"isSequential": true,
                    "activities": [
                        {
                            "name": "INGEST_API",
                            "type": "Copy",
                            "dependsOn": [],
                            "policy": {
                                "timeout": "7.00:00:00",
                                "retry": 0,
                                "retryIntervalInSeconds": 30,
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "source": {
                                    "type": "RestSource",
                                    "httpRequestTimeout": "00:01:40",
                                    "requestInterval": "00.00:00:00.010",
                                    "requestMethod": "POST",
                                    "requestBody": {
                                        "value": "@item().requestBody",
                                        "type": "Expression"
                                    },

Since your limit is 10 requests per minute, choose a hit rate below 10, say 8 requests per minute.

The pattern is to send 8 requests to the API (non-sequentially) from one pipeline. This pipeline sends at most 8 requests using a ForEach loop. Give it a retry policy with a 60-second interval as well.

After the ForEach loop, add a Wait activity of 60 seconds, after which the pipeline finishes.

This pipeline is called from another pipeline, which picks a batch of 8 requests at a time and calls this pipeline inside a loop. The Execute Pipeline activity should be set to wait for completion.
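
The batch-and-wait logic above can be sketched outside ADF to make the control flow concrete. This is only an illustration in Python, not ADF pipeline code: `chunk`, `run_in_batches`, and the `send` and `sleep` parameters are hypothetical names introduced here. Unlike the non-sequential ForEach, the sketch sends each batch one request at a time. The outer loop plays the role of the parent pipeline, and the pause after each batch plays the role of the Wait activity.

```python
import time
from typing import Callable, List, Sequence


def chunk(items: Sequence[dict], size: int) -> List[List[dict]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def run_in_batches(
    requests: Sequence[dict],
    send: Callable[[dict], None],      # hypothetical: performs one API call
    batch_size: int = 8,               # stays below the 10 requests/minute quota
    wait_seconds: float = 60.0,        # length of the pause after each batch
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send requests batch by batch, pausing after every batch.

    Returns the number of batches processed.
    """
    batches = chunk(requests, batch_size)
    for batch in batches:              # parent pipeline: one iteration per batch
        for request in batch:          # child pipeline: the loop over <= 8 requests
            send(request)
        sleep(wait_seconds)            # child pipeline: the wait after the loop
    return len(batches)
```

With 20 input items, for example, this produces batches of 8, 8, and 4 requests and pauses three times, so the whole run takes at least three minutes. Passing a no-op function as `sleep` lets you dry-run the batching without waiting.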
