A zip file download works with Apache HttpClient v4.3.6 but fails with other versions

This is a little weird, and I have done a fair amount of research trying to find the cause and a resolution. My objective is to download a zip file from a secured URL that requires login. Everything works perfectly when I use version 4.3.6 of the Apache HttpClient Maven dependency.

However, I cannot use this version, because my aws-sdk-java-core Maven dependency also depends on httpclient, and using v4.3.6 makes aws-sdk-java fail with a NoSuchMethod runtime exception. I understand this issue: httpclient v4.3.6 is nearer in the Maven dependency tree than the version (4.5.1) used by aws-sdk-java-core, so Maven resolves to 4.3.6. I will skip further details on this, because I am pretty sure I should make everything work with a single version of a Maven dependency rather than use multiple versions of the same jar.

Back to the original question. Since I cannot use v4.3.6, I switched my code to v4.5.1, and that is when the file download code started giving problems. With httpclient v4.5.1, the response contains the following HTML rather than the zip file located at the requested HTTPS URL.

<html>
<HEAD><META HTTP-EQUIV='PRAGMA' CONTENT='NO-CACHE'><META HTTP-EQUIV='CACHE-CONTROL' CONTENT='NO-CACHE'>
<TITLE>SAML 2.0 Auto-POST form</TITLE>
</HEAD>
<body onLoad="document.forms[0].submit()">
<NOSCRIPT>Your browser does not support JavaScript.  Please click the 'Continue' button below to proceed. <br><br>
</NOSCRIPT>
<form action="https://githubext.deere.com/saml/consume" method="POST">
<input type="hidden" name="SAMLResponse"  value="PFJlc3BvbnNlIHhtbG5zPSJ1cm46b2FzaXM6bmFtZXM6dGM6U0FNTDoyLjA6cHJvdG9jb2wiIERl">
<input type="hidden" name="RelayState" value="2F1HpzrUy5FdX">
<NOSCRIPT><INPUT TYPE="SUBMIT" VALUE="Continue"></NOSCRIPT>
</form>
</body>
</html>
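For anyone wondering what that page actually carries: the SAMLResponse field in a SAML auto-POST form is Base64-encoded XML, so decoding it is a quick way to confirm that the body is a sign-in step and not the file. A minimal sketch using only the JDK; the class and method names are made up for illustration, and the value is the one from the form above, which is truncated in the post and therefore decodes to an incomplete fragment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the original code.
class SamlResponsePeek {

    // Decodes the Base64 value of a SAMLResponse form field into XML text.
    static String decode(String samlResponseValue) {
        byte[] raw = Base64.getDecoder().decode(samlResponseValue);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Value copied from the hidden SAMLResponse input above (truncated in the post).
        String value =
            "PFJlc3BvbnNlIHhtbG5zPSJ1cm46b2FzaXM6bmFtZXM6dGM6U0FNTDoyLjA6cHJvdG9jb2wiIERl";
        System.out.println(decode(value));
    }
}
```

The decoded text starts with a SAML 2.0 protocol Response element, which is what the identity provider hands to the browser to post back to /saml/consume.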

With v4.3.6, the response is the zip file, as expected. I have tried submitting this HTML form manually by adding more code, but the response stays the same. The original code I use for the file download is provided below.

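import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.springframework.stereotype.Component;

@Component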
public class FileDAO {

    public static void main(String args[]) throws Exception{
        new FileDAO().loadFile("https://some_url.domain.com/zipball/master","myfile.zip");
    }


    public String loadFile(String url, String fileName) throws ClientProtocolException, IOException {

        HttpClient client = login();
        HttpResponse response = client.execute(new HttpGet(url));
        int statusCode = response.getStatusLine().getStatusCode();
        if (statusCode == 200) {
            String unzipToFolderName = fileName.replace(".", "_");
            FileOutputStream outputStream = new FileOutputStream(new File(fileName));
            writeToFile(outputStream, response.getEntity().getContent());            
            return unzipToFolderName;
        } else {
            throw new RuntimeException("error downloading file, HTTP Status code: " + statusCode);
        }
    }

    private void writeToFile(FileOutputStream outputStream, InputStream inputStream)  {
        try {
            int read = 0;
            byte[] bytes = new byte[1024];
            while ((read = inputStream.read(bytes)) != -1) {
                outputStream.write(bytes, 0, read);
            }
        } catch (Exception ex) {
            throw new RuntimeException("error writing zip file, error message : " + ex.getMessage(), ex);
        } finally {
            try {
                outputStream.close();
                inputStream.close();
            } catch (Exception ex) {}
        }
    }

    private HttpClient login() throws IOException {
        HttpClient client = getHttpClient();

        HttpResponse response = client.execute(new HttpGet("https://some_url.domain.com"));
        String responseBody = EntityUtils.toString(response.getEntity());
        Document doc = Jsoup.parse(responseBody);
        org.jsoup.select.Elements inputs = doc.getElementsByTag("input");
        int statusCode = response.getStatusLine().getStatusCode();
        if (statusCode == 200) {
            HttpPost httpPost = new HttpPost("https://some_url.domain.com/saml/consume");
            List<NameValuePair> data = new ArrayList<NameValuePair>();
            data.add(new BasicNameValuePair("SAMLResponse", doc.select("input[name=SAMLResponse]").val()));
            data.add(new BasicNameValuePair("RelayState", doc.select("input[name=RelayState]").val()));
            httpPost.setEntity(new UrlEncodedFormEntity(data));
            HttpResponse loginResponse = client.execute(httpPost);
            int loginStatusCode = loginResponse.getStatusLine().getStatusCode();
            // Consume the body so the pooled connection is released for the next request.
            EntityUtils.consume(loginResponse.getEntity());
            if (loginStatusCode != 302) {
                throw new RuntimeException("clone repo dao. error during login, HTTP Status code: " + loginStatusCode);
            }
        }
        return client;
    }

    private HttpClient getHttpClient() {
        CredentialsProvider provider = new BasicCredentialsProvider();
        UsernamePasswordCredentials credentials = new UsernamePasswordCredentials("userId", "password");
        provider.setCredentials(AuthScope.ANY, credentials);
        return HttpClientBuilder.create().setDefaultCredentialsProvider(provider).build();
    }
}
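One thing the code above cannot catch is a 200 response whose body is an HTML page, which is exactly the failure described here. A possible guard, sketched with JDK classes only, is to peek at the first bytes of the stream and check for the zip local-file-header signature ("PK" followed by 0x03 0x04) before writing the file. The class and method names are hypothetical, and the check assumes a non-empty archive (an empty zip starts with a different record).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original code.
class ZipSignature {

    // Local file header signature of a zip entry: 'P' 'K' 0x03 0x04.
    private static final byte[] MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Returns true if the first 'length' bytes of 'head' begin with the zip signature.
    static boolean hasZipSignature(byte[] head, int length) {
        if (length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads up to four bytes, pushes them back, and reports whether they look like a zip.
    // The PushbackInputStream must be created with a buffer of at least four bytes.
    static boolean peekIsZip(PushbackInputStream in) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r == -1) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            in.unread(head, 0, n); // leave the stream untouched for the real consumer
        }
        return hasZipSignature(head, n);
    }

    public static void main(String[] args) throws IOException {
        byte[] html = "<html>".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(html), 4);
        System.out.println("looks like a zip: " + peekIsZip(in));
    }
}
```

In loadFile, the entity stream could be wrapped as new PushbackInputStream(response.getEntity().getContent(), 4) and rejected with an error when peekIsZip returns false, so a login page is never saved as myfile.zip.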

I am still analyzing what goes wrong with Apache HttpClient versions other than 4.3.6. The same code works with 4.3.6 but not with any later version. Any help is really appreciated. Thank you all.

Issue resolved. After going through the Apache HttpClient documentation and seriously debugging the logs, I was able to resolve this issue. I created two server logs, one for v4.3.6 and the other for v4.5.2. Comparing them, I found that the culprit was the cookie type. In the old version the cookie type was (automatically) configured as BEST_MATCH, and it worked. In v4.5.2, however, the BEST_MATCH cookie type is deprecated. I had been experimenting with cookie settings by adding more code, but the cookie sent in the server response did not match the DEFAULT cookie type that I had configured in the client code. As a result the cookie was not set up properly, which is why the response was the SAML response (the login page again) instead of the zip file.
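The post does not say how the two logs were captured. For a client-side view of the same exchange, HttpClient 4.x logs through Commons Logging, and its logging guide describes system properties along these lines for turning on header and context logging. This is a sketch that assumes Commons Logging's SimpleLog is acceptable as the backend; the class name is hypothetical, and the properties have to be set before the first HttpClient class is loaded.

```java
// Hypothetical helper, not part of the original code.
class HttpClientDebugLogging {

    // Sets the Commons Logging properties that make HttpClient 4.x print
    // request/response headers and context messages, with wire (body) logging kept quiet.
    static void enable() {
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http", "DEBUG");
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http.wire", "ERROR");
    }

    public static void main(String[] args) {
        enable();
        System.out.println(System.getProperty("org.apache.commons.logging.simplelog.log.org.apache.http"));
    }
}
```

Running the same download once per HttpClient version with this enabled and diffing the output should show the Set-Cookie headers from the server and the Cookie headers the client sends back, which is where a cookie spec mismatch would become visible.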

The Apache HttpClient documentation says this about the cookie specifications:

Default: Default cookie policy is a synthetic policy that picks up either RFC 2965, RFC 2109 or Netscape draft compliant implementation based on properties of cookies sent with the HTTP response (such as version attribute, now obsolete). This policy will be deprecated in favor of the standard (RFC 6265 compliant) implementation in the next minor release of HttpClient.

Standard strict: State management policy compliant with the syntax and semantics of the well-behaved profile defined by RFC 6265, section 4.

I updated the cookie configuration to STANDARD_STRICT mode, and everything started working with the latest version, 4.5.2.

Here is the updated getHttpClient() method:

private CloseableHttpClient getHttpClient() {
    CredentialsProvider provider = new BasicCredentialsProvider();
    UsernamePasswordCredentials credentials = new UsernamePasswordCredentials(gitUserId, gitPassword);
    provider.setCredentials(AuthScope.ANY, credentials);
    RequestConfig config = RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD_STRICT).build();
    return HttpClientBuilder.create()
            .setDefaultCredentialsProvider(provider)
            .setDefaultRequestConfig(config)
            .setRedirectStrategy(new LaxRedirectStrategy())
            .build();
}
