
Zip file download works with Apache HttpClient v4.3.6 but fails with other versions

This is a little weird, and I have done a fair amount of research into the cause and resolution of this issue. My objective is to download a zip file from a secured URL that requires login. Everything works perfectly when I use the Apache HttpClient Maven dependency at version 4.3.6. However, I cannot use this version, because my aws-sdk-java-core Maven dependency also depends on httpclient, and using v4.3.6 makes aws-sdk-java fail with a NoSuchMethod runtime exception. I understand this issue: my httpclient v4.3.6 dependency is nearer in the Maven dependency tree than the version (4.5.1) used by aws-sdk-java-core, so 4.3.6 is the one that gets picked. I will skip further details, because I am fairly sure I should make everything work with a single version of a Maven dependency rather than multiple versions of the same jar.

Back to the original question. Since I cannot use v4.3.6, I switched my code to v4.5.1, and that is when the file download started failing. With httpclient v4.5.1, the response contains the following HTML instead of the zip file at the requested HTTPS URL.

<html>
<HEAD><META HTTP-EQUIV='PRAGMA' CONTENT='NO-CACHE'><META HTTP-EQUIV='CACHE-CONTROL' CONTENT='NO-CACHE'>
<TITLE>SAML 2.0 Auto-POST form</TITLE>
</HEAD>
<body onLoad="document.forms[0].submit()">
<NOSCRIPT>Your browser does not support JavaScript.  Please click the 'Continue' button below to proceed. <br><br>
</NOSCRIPT>
<form action="https://githubext.deere.com/saml/consume" method="POST">
<input type="hidden" name="SAMLResponse"  value="PFJlc3BvbnNlIHhtbG5zPSJ1cm46b2FzaXM6bmFtZXM6dGM6U0FNTDoyLjA6cHJvdG9jb2wiIERl">
<input type="hidden" name="RelayState" value="2F1HpzrUy5FdX">
<NOSCRIPT><INPUT TYPE="SUBMIT" VALUE="Continue"></NOSCRIPT>
</form>
</body>
</html>

With v4.3.6, the response is the zip file, as expected. I tried submitting this HTML form manually by adding more code, but the response did not change. My original file download code is below.

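import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.springframework.stereotype.Component;

@Component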
public class FileDAO {

    public static void main(String[] args) throws Exception {
        new FileDAO().loadFile("https://some_url.domain.com/zipball/master","myfile.zip");
    }


    public String loadFile(String url, String fileName) throws ClientProtocolException, IOException {

        HttpClient client = login();
        HttpResponse response = client.execute(new HttpGet(url));
        int statusCode = response.getStatusLine().getStatusCode();
        if (statusCode == 200) {
            String unzipToFolderName = fileName.replace(".", "_");
            FileOutputStream outputStream = new FileOutputStream(new File(fileName));
            writeToFile(outputStream, response.getEntity().getContent());            
            return unzipToFolderName;
        } else {
            throw new RuntimeException("error downloading file, HTTP Status code: " + statusCode);
        }
    }

    private void writeToFile(FileOutputStream outputStream, InputStream inputStream) {
        // try-with-resources closes both streams, even if copying or one of the close() calls fails
        try (InputStream in = inputStream; FileOutputStream out = outputStream) {
            byte[] buffer = new byte[1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException ex) {
            throw new RuntimeException("error writing zip file, error message : " + ex.getMessage(), ex);
        }
    }

    private HttpClient login() throws IOException {
        HttpClient client = getHttpClient();

        // The first request is expected to return the SAML auto-POST form
        HttpResponse response = client.execute(new HttpGet("https://some_url.domain.com"));
        int statusCode = response.getStatusLine().getStatusCode();
        String responseBody = EntityUtils.toString(response.getEntity());
        if (statusCode == 200) {
            Document doc = Jsoup.parse(responseBody);
            // Submit the hidden form fields by hand, as a browser would via JavaScript
            HttpPost httpPost = new HttpPost("https://some_url.domain.com/saml/consume");
            List<NameValuePair> data = new ArrayList<NameValuePair>();
            data.add(new BasicNameValuePair("SAMLResponse", doc.select("input[name=SAMLResponse]").val()));
            data.add(new BasicNameValuePair("RelayState", doc.select("input[name=RelayState]").val()));
            httpPost.setEntity(new UrlEncodedFormEntity(data));
            HttpResponse loginResponse = client.execute(httpPost);
            int loginStatusCode = loginResponse.getStatusLine().getStatusCode();
            // Consume the body so the connection is released before the next request
            EntityUtils.consume(loginResponse.getEntity());
            if (loginStatusCode != 302) {
                throw new RuntimeException("clone repo dao. error during login, HTTP Status code: " + loginStatusCode);
            }
        }
        return client;
    }

    private HttpClient getHttpClient() {
        CredentialsProvider provider = new BasicCredentialsProvider();
        UsernamePasswordCredentials credentials = new UsernamePasswordCredentials("userId", "password");
        provider.setCredentials(AuthScope.ANY, credentials);
        return HttpClientBuilder.create().setDefaultCredentialsProvider(provider).build();
    }
}
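
A note on the failure mode: if the SAML page comes back with status 200, the `statusCode == 200` check in loadFile cannot tell it apart from the real archive, and the HTML ends up saved as myfile.zip. A minimal sketch of a guard that peeks at the zip signature before writing is shown here. The class and method names (`ZipSignatureCheck`, `startsWithZipSignature`, `check`) are made up for illustration and are not part of the original code; it also assumes a non-empty archive, which starts with the bytes `PK 03 04`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ZipSignatureCheck {

    // Local file header signature that starts a non-empty zip archive: "PK" 0x03 0x04
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /**
     * Peeks at the first four bytes and pushes them back, so the caller can
     * still copy the complete stream to disk afterwards.
     * The stream must have a pushback buffer of at least four bytes.
     */
    public static boolean startsWithZipSignature(PushbackInputStream in) throws IOException {
        byte[] head = new byte[ZIP_MAGIC.length];
        int total = 0;
        while (total < head.length) {
            int n = in.read(head, total, head.length - total);
            if (n == -1) {
                break; // stream ended before four bytes were available
            }
            total += n;
        }
        if (total > 0) {
            in.unread(head, 0, total); // restore what was consumed
        }
        if (total < head.length) {
            return false;
        }
        for (int i = 0; i < head.length; i++) {
            if (head[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Demo helper (hypothetical): wraps raw bytes the way a response stream would be wrapped
    public static boolean check(byte[] body) {
        try {
            return startsWithZipSignature(
                    new PushbackInputStream(new ByteArrayInputStream(body), ZIP_MAGIC.length));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] html = "<html><HEAD>".getBytes(StandardCharsets.US_ASCII);
        byte[] zip = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        System.out.println(check(html)); // false
        System.out.println(check(zip));  // true
    }
}
```

In loadFile, this could be applied by wrapping `response.getEntity().getContent()` in a `PushbackInputStream` with a four-byte buffer, throwing if the check fails, and passing the same wrapped stream on to writeToFile.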

I am still analyzing what goes wrong with Apache HttpClient versions other than 4.3.6. The same code works with 4.3.6 but not with later versions. Any help is really appreciated. Thank you all.

Issue resolved. After going through the Apache HttpClient documentation and some serious debugging of the logs, I was able to resolve this issue. I created two server logs, one for v4.3.6 and one for v4.5.2, compared them, and found that the culprit was the cookie spec. In the old version the cookie spec was (automatically) configured as BEST_MATCH, and it worked. In v4.5.2, however, BEST_MATCH has been deprecated. I tried various cookie settings by adding more code, but the cookie sent in the server response did not match the DEFAULT cookie spec that I had configured in the client code. As a result the cookie was not set up properly, which is why the request returned the SAML response (the login page again) instead of the zip file.
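
For anyone who wants to repeat this kind of comparison from the client side, here is a minimal sketch of turning on HttpClient's own debug logging, which includes request and response headers such as Set-Cookie and Cookie. It assumes Commons Logging falls back to its built-in SimpleLog backend (no Log4j or similar configured); the class name `HttpClientDebugLogging` and the method `enable` are made up for illustration. The property names are the ones listed in the HttpClient logging guide.

```java
public class HttpClientDebugLogging {

    /** Must run before the first HttpClient class is loaded, e.g. first thing in main(). */
    public static void enable() {
        // Route Commons Logging to its built-in SimpleLog backend (writes to System.err)
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Context and header logging for everything under org.apache.http
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http", "DEBUG");
        // Keep full body dumps off; a binary zip payload would flood the log
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http.wire", "ERROR");
    }

    public static void main(String[] args) {
        enable();
        System.out.println(System.getProperty(
                "org.apache.commons.logging.simplelog.log.org.apache.http")); // DEBUG
        // ... build the HttpClient and run the login and download from here ...
    }
}
```

Running the same program once per httpclient version and diffing the two outputs should make differences in cookie handling easier to spot.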

The Apache HttpClient documentation describes these cookie specifications as follows:

Default: Default cookie policy is a synthetic policy that picks up either RFC 2965, RFC 2109 or Netscape draft compliant implementation based on properties of cookies sent with the HTTP response (such as version attribute, now obsolete). This policy will be deprecated in favor of the standard (RFC 6265 compliant) implementation in the next minor release of HttpClient.

Standard strict: State management policy compliant with the syntax and semantics of the well-behaved profile defined by RFC 6265, section 4.

I updated the cookie configuration to STANDARD_STRICT, and everything started working with the latest version, 4.5.2.

Here is the updated getHttpClient() method:

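// Additional imports: org.apache.http.client.config.CookieSpecs, org.apache.http.client.config.RequestConfig,
// org.apache.http.impl.client.CloseableHttpClient, org.apache.http.impl.client.LaxRedirectStrategy
private CloseableHttpClient getHttpClient() {
    CredentialsProvider provider = new BasicCredentialsProvider();
    UsernamePasswordCredentials credentials = new UsernamePasswordCredentials(gitUserId, gitPassword);
    provider.setCredentials(AuthScope.ANY, credentials);
    // Use the RFC 6265 (strict) cookie spec instead of the default one
    RequestConfig config = RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD_STRICT).build();
    return HttpClientBuilder.create()
            .setDefaultCredentialsProvider(provider)
            .setDefaultRequestConfig(config)
            .setRedirectStrategy(new LaxRedirectStrategy())
            .build();
}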
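
Since the whole problem started with two httpclient versions competing in the Maven dependency tree, it can also help to confirm at runtime which JAR a class is loaded from. The following is only a sketch using plain reflection; `WhichJar` and `locationOf` are made-up names, and the class name passed in `main` is simply one that exists in both the 4.3.x and 4.5.x lines.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns where the named class was loaded from, or a short note if that cannot be determined. */
    public static String locationOf(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class<?> type = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Typical for classes that ship with the JDK itself
                return "(no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Prints the JAR location if httpclient is on the classpath
        System.out.println(locationOf("org.apache.http.impl.client.HttpClientBuilder"));
    }
}
```

When the JAR comes from a local Maven repository, the printed path normally contains the version directory, which shows which of the competing versions actually won.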
