
Apache HttpClient and remote files in URL with https scheme

I'm using version 4.2.5 of AutoRetryHttpClient from org.apache.httpcomponents to download a PDF file from a URL whose scheme is https. The code is written in NetBeans 7.3 and uses JDK 7.

Suppose the (imaginary) PDF resource is at https://www.thedomain.with/my_resource.pdf. I have the following code:

SchemeRegistry registry = new SchemeRegistry();
    try {
        final SSLSocketFactory sf = new SSLSocketFactory(new TrustStrategy() {
            @Override
            public boolean isTrusted(X509Certificate[] chain, String authType)
                    throws CertificateException {
                return true;
            }
        });

        registry.register(new Scheme("https", 3920, sf));            
    } catch (NoSuchAlgorithmException | KeyManagementException | KeyStoreException | UnrecoverableKeyException ex) {
        Logger.getLogger(HttpConnection.class.getName()).log(Level.SEVERE, null, ex);
    }        
    // Here I create the client.
    HttpClient client = new AutoRetryHttpClient(new DefaultHttpClient(new PoolingClientConnectionManager(registry)),
            new DefaultServiceUnavailableRetryStrategy(5, // max number of retries
               100)); // retry interval

        HttpResponse httpResponse = null;
        try {
            HttpGet httpget = new HttpGet("https://www.thedomain.with/my_resource.pdf");
            //I set header and Mozilla User-Agent
            httpResponse = client.execute(httpget);
        } catch (IOException ex) {
        }
        ... //other lines of code to get and save the file, not really important since the code is never reached

When I call client.execute, the following exception is thrown:

org.apache.http.conn.HttpHostConnectException: Connection to https://www.thedomain.with refused

What can I do to get that pdf resource?

PS: I can download it in a browser, so there must be a way to obtain that file.

There seem to be a couple of problems:

  • You registered the Scheme with 3920 as the default port, which is a non-standard port for HTTPS. If the server were actually running on that port, you would have to use this URL in the browser: https://www.thedomain.with:3920/my_resource.pdf. Since the URL you use in the browser does not include port 3920, the server is presumably running on the default HTTPS port, 443, and nothing is listening on 3920, which would explain the "Connection refused" error. You should change new Scheme("https", 3920, sf) to new Scheme("https", 443, sf).
  • It appears that the CN in your server's certificate doesn't match its hostname, which is causing the SSLPeerUnverifiedException. For this to work, you would need to use the SSLSocketFactory(TrustStrategy, HostnameVerifier) constructor and pass a verifier that skips this check. Apache provides the AllowAllHostnameVerifier for this purpose.
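
To double-check the port reasoning in the first point, here is a minimal sketch using only the JDK's java.net classes (no HttpClient involved). The EffectivePort class and its of method are hypothetical names made up for this illustration; the sketch only shows which port a URL targets when it does or does not name one explicitly.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper for illustration only; not part of HttpClient.
class EffectivePort {

    // Returns the port a client would connect to for the given URL.
    static int of(String address) {
        try {
            URL url = URI.create(address).toURL();
            // getPort() is -1 when the URL names no port explicitly;
            // getDefaultPort() then supplies the scheme default.
            return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + address, e);
        }
    }

    public static void main(String[] args) {
        // The URL that works in the browser: no explicit port.
        System.out.println(of("https://www.thedomain.with/my_resource.pdf"));      // prints 443
        // What the browser URL would have to look like for port 3920.
        System.out.println(of("https://www.thedomain.with:3920/my_resource.pdf")); // prints 3920
    }
}
```

The browser URL has no port, so it resolves to 443, which is the value the Scheme registration in the question should carry as well.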

Note: You really shouldn't use the no-op TrustStrategy and HostnameVerifier in production code, as this essentially turns off all security checks in terms of authenticating the remote server and leaves you open to impersonation attacks.
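
One possible safer alternative, sketched here with plain JSSE classes from the JDK, is to trust only the certificates you explicitly supply instead of trusting everything. PinnedTrust and contextTrustingOnly are hypothetical names for this illustration, and how you obtain the server's certificate (for example, exporting it from the browser) is left as an assumption.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper for illustration only.
class PinnedTrust {

    // Builds an SSLContext whose trust managers accept only the given certificates.
    static SSLContext contextTrustingOnly(Certificate... trusted) {
        try {
            // In-memory key store used purely as a trust store.
            KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
            store.load(null, null); // initialize empty; no file is read
            for (int i = 0; i < trusted.length; i++) {
                store.setCertificateEntry("trusted-" + i, trusted[i]);
            }

            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(store);

            SSLContext context = SSLContext.getInstance("TLS");
            // No client key managers; default source of randomness.
            context.init(null, tmf.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not build SSLContext", e);
        }
    }

    public static void main(String[] args) {
        // With no certificates supplied, the context trusts nothing at all.
        System.out.println(contextTrustingOnly().getProtocol());
    }
}
```

As far as I know, HttpClient 4.2's SSLSocketFactory also has a constructor that accepts an SSLContext, so a context built this way could be registered in the Scheme in place of the trust-all factory. Note that this does not by itself address a hostname mismatch in the certificate.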
