AWS Lambda REST Call Timeouts

We use Lambda to provide a sandboxed environment in which customers can run custom JS scripts. A custom SDK is exposed in the script execution context. One of the operations the SDK allows is executing REST calls to external resources. The REST call timeout is set to 60000 ms. If a request takes longer than 60 seconds, we abort it manually.

requestTimeoutRef = setTimeout(() => {
  requestTimeoutRef = null;
  request.abort();
  // Backticks are required for the ${...} placeholders to be interpolated.
  response.error = `[RestSystem] call > timeout on method: ${method}; url: ${url}; waited for ${restTimeout}ms`;
  callback(response.error, response.result);
}, 60000);

Timeout configuration in Rest system - Lambda

If the request is resolved we clear the timeout either on success or on error.

Lambda setup:

Lambda is configured to run inside a VPC on AWS and has two private subnets assigned. It is also assigned to a security group that has the ephemeral ports open.

Node version: 14.xx

Lambda timeout: 3 min

Current situations / scenarios:

REST calls are used intensively, and most of them go to our API. We are noticing occasional timeouts in our production environment against different endpoints, both internal and external.

Steps taken:

The first step was to figure out at which point the request hangs by logging outgoing requests in Lambda and incoming requests on our API. The conclusion was that the requests never reach our API, so our debugging focus switched to Lambda.

We are logging most of the request lifecycle events in Lambda:

  • Request:
    • Error
    • Socket ('connect', 'close')
    • End
  • Response:
    • Data
    • End
    • Error

const request = communicationModule.request(options, function (result) {
  result.setEncoding('utf8');
  result.on('data', function (chunk) {
    // ...
  });
  result.on('end', function () {
    // ...
  });
  result.on('error', function (e) {
    // ...
  });
});

// Abort the request and report a timeout if it has not completed after 60 seconds.
requestTimeoutRef = setTimeout(() => {
  requestTimeoutRef = null;
  request.abort();
  response.error = `[RestSystem] call > timeout on method: ${method}; url: ${url}; waited for ${restTimeout}ms`;

  log.message(`[RestSystem] call > Request ${requestId} to ${url} failed with error: exception: waited for ${restTimeout}ms.`);

  callback(response.error, response.result);
}, 60000);

request.on('error', function (err) {
  // ...
});

// Log the socket lifecycle: assignment, connection, and close.
request.on('socket', socket => {
  log.message(`[RestSystem] call > Request ${requestId} socket connecting "${socket.connecting}" socket reused: ${request.reusedSocket}.`);

  socket.on('connect', () => {
    log.message(`[RestSystem] call > Request ${requestId} socket connection.`);
  });

  socket.on('close', () => {
    log.message(`[RestSystem] call > Request ${requestId} closing socket.`);
  });
});

// Called when the request completes, on success or on error.
function stopRequestTimeout() {
  clearTimeout(requestTimeoutRef);
  requestTimeoutRef = null;
}

Rest System Logs in Lambda

As can be seen in the screenshot below, the last log entry is written when the socket is assigned to the request. After 60 seconds we see the timeout error, with no socket 'connect' event logged in between.

[Screenshot: CloudWatch logs]
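One way to narrow down a stall like this is to time the connection phase separately from the overall 60-second request timeout, so the logs show explicitly that a socket was assigned but never connected. The following is only a minimal sketch, not part of the code above: `watchConnectPhase`, `connectTimeoutMs` and `onStall` are hypothetical names, and it assumes `request` behaves like a Node.js `http.ClientRequest`, i.e. it emits the same 'socket' event used in the logging code.

```javascript
// Hypothetical helper: reports requests whose socket never emits 'connect'.
// Assumes `request` behaves like http.ClientRequest (emits 'socket').
function watchConnectPhase(request, connectTimeoutMs, onStall) {
  request.on('socket', (socket) => {
    // A reused keep-alive socket is already connected; nothing to watch.
    if (!socket.connecting) {
      return;
    }

    const startedAt = Date.now();
    const timer = setTimeout(() => {
      // Still no 'connect' event: report how long we have waited so far.
      onStall(Date.now() - startedAt);
    }, connectTimeoutMs);

    // Stop the watchdog as soon as the socket connects or closes.
    const stop = () => clearTimeout(timer);
    socket.once('connect', stop);
    socket.once('close', stop);
  });
}
```

A caller could pass a budget well below the 60-second request timeout and, inside `onStall`, log the request id or destroy the request early. Which threshold is appropriate depends on the network path and would need tuning.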

The second step was to check the Lambda configuration (VPC, subnets and security groups), which seems fine.

Lastly, we updated the Lambda Node.js version to 14 and the node packages to the latest compatible versions.

Reproduction steps:

We have not managed to get a reliable reproduction of the issue since it only occurs on production where there is load on Lambda.

We have not identified any patterns in the requests which are failing.

If this is happening at a significant rate, I would raise a support ticket with AWS. If it is only occasional, I would put it down to the fact that failures do happen, and that you need to implement a retry strategy.
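A retry strategy along those lines could look roughly like the sketch below. It is an illustration under assumptions, not the SDK's actual code: `withRetry`, `backoffDelay` and the option names are hypothetical, and `operation` is assumed to be a function that returns a Promise (the callback-style REST call above would need to be wrapped first).

```javascript
// Delay before the retry that follows attempt `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical helper: runs `operation` up to `maxAttempts` times,
// waiting with exponential backoff between failed attempts.
async function withRetry(operation, { maxAttempts = 3, baseMs = 200, maxMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      // Do not wait after the final attempt; just fall through and rethrow.
      if (attempt < maxAttempts - 1) {
        const delayMs = backoffDelay(attempt, baseMs, maxMs);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Two caveats apply. With a 60-second timeout per call, three attempts could use up the entire 3-minute Lambda timeout, so a shorter per-attempt timeout would probably be needed. Retrying non-idempotent calls (for example a POST that creates a resource) can also duplicate side effects if an earlier attempt did reach the server.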
