
Making multiple HTTP requests efficiently

I want to make a few million HTTP requests to a web service of the form: http://(some ip)/{id}

I have the list of ids with me. A simple calculation shows that my Java code will take around 4-5 hours to get the data from the API. The code is:

// Open a connection for each id and read the entire response body into a string.
URL getUrl = new URL("http url");
URLConnection conn = getUrl.openConnection();
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
StringBuffer sbGet = new StringBuffer();
String getline;
while ((getline = rd.readLine()) != null)
{
    sbGet.append(getline);
}
rd.close();
String getResponse = sbGet.toString();

Is there a way to make such requests more efficiently so that they take less time?

One way is to use an ExecutorService with a fixed thread pool (the size depends on how much load the target HTTP service can handle) and issue requests to the service in parallel. Each Runnable would basically perform the steps you outlined in your sample code.
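A minimal sketch of that approach is below. The base URL http://example.com/, the pool size of 20, and the result map are assumptions you would adjust for your service; the per-request work is the same as in your sample code.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelFetcher {

    // Assumptions: adjust the base URL and pool size to your service.
    private static final String BASE_URL = "http://example.com/";
    private static final int POOL_SIZE = 20;

    public static Map<String, String> fetchAll(List<String> ids) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(POOL_SIZE);
        Map<String, String> results = new ConcurrentHashMap<>();

        for (String id : ids) {
            pool.submit(() -> {
                try {
                    results.put(id, fetch(BASE_URL + id));
                } catch (Exception e) {
                    // In real code, record the failure so the id can be retried later.
                    System.err.println("Failed to fetch " + id + ": " + e);
                }
            });
        }

        pool.shutdown();                          // no more tasks will be submitted
        pool.awaitTermination(1, TimeUnit.DAYS);  // wait for the queued requests to finish
        return results;
    }

    // The same steps as the code in the question, factored into a helper method.
    private static String fetch(String url) throws Exception {
        URLConnection conn = new URL(url).openConnection();
        StringBuilder sb = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = rd.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }
}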

You need to profile your code before you start optimizing it; otherwise you may end up optimizing the wrong part. Depending on the results you obtain from profiling, consider the following options:

  • Change the protocol to allow you to batch the requests
  • Issue multiple requests in parallel (use multiple threads or execute multiple processes in parallel; see this article)
  • Cache previous results to reduce the number of requests
  • Compress the request or the response
  • Persist the HTTP connection (a sketch that also covers parallel requests follows this list)
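The last two points can be combined: a single java.net.http.HttpClient (Java 11+) keeps TCP connections alive and reuses them across requests, and sendAsync issues the requests in parallel without dedicating a thread per request. This is only a sketch under assumed names (BASE_URL, fetchAll); for millions of ids you would submit them in chunks rather than all at once.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class KeepAliveFetcher {

    // Assumption: replace with the real service address.
    private static final String BASE_URL = "http://example.com/";

    // One shared client: it keeps connections alive and reuses them,
    // which covers the "persist the HTTP connection" point.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static List<String> fetchAll(List<String> ids) {
        // Issue the requests asynchronously (in practice, chunk a list of
        // millions of ids rather than submitting everything at once).
        List<CompletableFuture<String>> futures = ids.stream()
                .map(id -> HttpRequest.newBuilder(URI.create(BASE_URL + id)).GET().build())
                .map(req -> CLIENT.sendAsync(req, HttpResponse.BodyHandlers.ofString())
                        .thenApply(HttpResponse::body))
                .collect(Collectors.toList());

        // Wait for all responses; a failed request surfaces as an exception from join().
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}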

Is there a way to more efficiently make such requests which will take less time?

Well, you probably could run a small number of requests in parallel, but you are likely to saturate the server. Beyond a certain number of requests per second, the throughput is likely to degrade ...

To get past that limit, you will need to redesign the server and/or the server's web API. For instance:

  • Changing your web API to allow a client to fetch a number of objects in each request will reduce the request overheads.

  • Compression could help, but you are trading off network bandwidth for CPU time and/or latency. If you have a fast end-to-end network, then compression might actually slow things down. (A sketch of requesting gzip-compressed responses follows this list.)

  • Caching helps in general, but probably not in your use-case. (You are requesting each object just once ...)

  • Using persistent HTTP connections avoids the overhead of creating a new TCP/IP connection for each request, but I don't think you can do this for HTTPS. (And that's a shame, because HTTPS connection establishment is considerably more expensive.)
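To illustrate the compression trade-off, here is a sketch that asks the server for a gzip-compressed response and transparently decompresses it. Whether this actually helps depends on your network and on the server supporting gzip; the URL is a placeholder.

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.zip.GZIPInputStream;

public class CompressedFetch {

    // Requests a gzip-compressed response and falls back to plain text
    // if the server does not compress.
    static String fetchCompressed(String url) throws Exception {
        URLConnection conn = new URL(url).openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");

        InputStream in = conn.getInputStream();
        if ("gzip".equalsIgnoreCase(conn.getContentEncoding())) {
            in = new GZIPInputStream(in);   // decompress on the fly
        }

        StringBuilder sb = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = rd.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }
}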
