
How to add a fix to avoid 504 Gateway timeout error

I am getting a 504 Gateway Timeout error from a GET call to another service. I recently tried to fix it by increasing the timeout period, but that didn't help.

This is what I have tried:

   public void getUserInformation(final Integer userId) {
        HttpClient httpClient = getBasicAuthDefaultHttpClient();
        HttpGet httpGet = new HttpGet("http://xxxx/users/"+userId);
        httpGet.addHeader("userid", userid);
        httpGet.addHeader("secret", secret);
        try {
            HttpResponse response = httpClient.execute(httpGet);
            HttpEntity entity = response.getEntity();

            if (entity != null
                    && HttpStatus.OK.value() == response.getStatusLine().getStatusCode()) {
                ObjectMapper objectMapper = new ObjectMapper();
                userInfo = objectMapper.readValue(entity.getContent(), UserInfo.class);
            } else {
                logger.error("Call to the service failed: response code: {}",
                        response.getStatusLine().getStatusCode());
            }
        } catch (Exception e) {
            // Pass the exception as the last argument so the stack trace is logged
            logger.error("Exception while calling the service", e);
        }

   }

  public HttpClient getBasicAuthDefaultHttpClient() {
    CredentialsProvider provider = new BasicCredentialsProvider();
    UsernamePasswordCredentials creds = new UsernamePasswordCredentials(user, password);
    provider.setCredentials(AuthScope.ANY, creds);

    //Fix to avoid HTTP 504 ERROR (GATEWAY TIME OUT ERROR) for ECM calls
    RequestConfig.Builder requestBuilder = RequestConfig.custom();
    requestBuilder.setConnectTimeout(30 * 1000);
    requestBuilder.setConnectionRequestTimeout(30 * 1000);

    HttpClientBuilder builder = HttpClientBuilder.create();
    builder.setDefaultRequestConfig(requestBuilder.build());
    builder.setDefaultCredentialsProvider(provider);

    return builder.build();
  }

I call this method in a loop to process records. It works for most of the records but fails for a few userIds. What I noticed is that everything works fine when I rerun only the failed records, so I am not sure what the problem is.

I thought of calling the method again whenever I receive a 504, hoping to get a 200 on the next attempt.

I am not sure whether this is a good idea. Any advice would be greatly appreciated.

According to the description of the 504 Gateway Timeout status code, it is returned when you have a chain of servers that communicate to process the request and one of the nodes (not the server you are calling but some later one) is not able to process the request in a timely fashion.

I would presume that the situation you are in could be depicted as follows.

CLIENT -> USERS SERVICE -> SOME OTHER SERVICE

The problem is that SOME OTHER SERVICE is taking too long to process your request. The USERS SERVICE gives up at some point in time and returns you this specific status code to indicate that.

As far as I know, there is little you can do on the client side to mitigate the problem. You need to get in touch with the owners of the USERS SERVICE and ask them to increase their timeout, or with the owners of SOME OTHER SERVICE and ask them to improve their performance.

As for why such an error could occur only from time to time: it is possible that you, in combination with other clients, are transitively overloading SOME OTHER SERVICE, causing it to process requests more and more slowly. Or it could be that SOME OTHER SERVICE has throttling or rate limiting enabled to prevent Denial of Service attacks. By making too many requests to the USERS SERVICE, you may be consuming the quota it has.
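If overload or rate limiting is indeed the cause, spacing out the calls in the processing loop may reduce the failures. Below is a minimal sketch of such pacing, using only the JDK. The class name `RequestPacer` and the interval are hypothetical, and the right interval depends on limits that only the service owners can confirm.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: enforces a minimum interval between consecutive calls. */
final class RequestPacer {
    private final long minIntervalNanos;
    private long nextAllowedNanos;

    RequestPacer(long minIntervalMillis) {
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
        this.nextAllowedNanos = System.nanoTime(); // the first call is never delayed
    }

    /** Blocks until the minimum interval since the previous slot has elapsed. */
    synchronized void awaitTurn() {
        long now = System.nanoTime();
        long waitNanos = nextAllowedNanos - now;
        if (waitNanos > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and let the caller proceed
                Thread.currentThread().interrupt();
            }
        }
        // Schedule the next slot relative to whichever is later: now or the planned slot
        long base = waitNanos > 0 ? nextAllowedNanos : now;
        nextAllowedNanos = base + minIntervalNanos;
    }
}
```

In the question's loop, one would create a single pacer (for example `new RequestPacer(200)`) before the loop and call `awaitTurn()` right before each `getUserInformation(userId)` call.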

Of course, all of this is speculation without knowing your actual scenario.
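As for the idea of retrying on a 504: since the failed records succeed when rerun, the failures look transient, and a GET is idempotent, so a bounded retry with backoff is a reasonable workaround while the underlying cause is investigated. Here is a minimal sketch, assuming the HTTP call is wrapped in a supplier that returns the status code. The names `RetryOn504` and `callWithRetry`, the attempt limit, and the delays are all hypothetical.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper: retries a call while it keeps returning 504. */
final class RetryOn504 {
    static final int GATEWAY_TIMEOUT = 504;

    /**
     * Invokes statusCall up to maxAttempts times, doubling the delay between
     * attempts, and returns the last status code observed.
     */
    static int callWithRetry(IntSupplier statusCall, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMillis = initialDelayMillis;
        int status = statusCall.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == GATEWAY_TIMEOUT; attempt++) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop retrying
                Thread.currentThread().interrupt();
                return status;
            }
            delayMillis *= 2; // exponential backoff
            status = statusCall.getAsInt();
        }
        return status;
    }
}
```

In the question's code, the supplier would execute the `HttpGet` and return `response.getStatusLine().getStatusCode()`. Keeping the number of attempts small matters here: if the downstream service is overloaded, aggressive retries add to the load rather than relieve it.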

I faced the same problem some time back. Below are the checks I did to resolve it, adding more detail to the analogy above.

Client-> Users Service -> Some Other Service

Client checks:

'Some Other Service' checks: check whether throttling/rate limiting is configured to prevent DoS attacks. If it is, you need to increase the timeouts on Some Other Service. I was running a Tomcat server on AWS and changed the idle timeout in the YAML file:

metadata:
    annotations:
        # below for OpenShift, which worked for me
        haproxy.router.openshift.io/timeout: "20000"
        # below for the Kubernetes idle timeout on the AWS ELB
        # (this annotation is documented as a value in seconds)
        service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "20000"

I also changed the connector timeout in Tomcat:

 <Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443"/>

Voila. It worked for me.
