
Apache FTPClient fails to retrieve a file when it's almost downloaded

Apache's FTPClient fails to download a file that FileZilla downloads without any problem.

Basically, what I'm trying to do after a successful login and listing is to download one specific file:

FTPClient client = new FTPClient();
client.setDataTimeout(20000);
client.setConnectTimeout(20000);
client.setBufferSize(65536);
//...
client.connect(host);
client.login(user, pswd);
// response validation
client.enterLocalPassiveMode();
// some listings with validations

InputStream in = new BufferedInputStream(client.retrieveFileStream(ftpFilePath), 16384);
// ...
byte[] buffer = new byte[8192];
int rd;
while ((rd = in.read(buffer)) > 0) {
    // .. reading the file and updating download progress

The last lines could easily be replaced with FTPClient's own file download, with virtually the same result, but then we can't track the download progress:

client.setControlKeepAliveTimeout(30);
client.retrieveFile(ftpFilePath, new org.apache.commons.io.output.NullOutputStream());
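
As an aside, Commons Net 3.x seems to expose a copy-stream listener that retrieveFile() notifies while it copies data, which may be enough to keep the progress reporting even with this variant. An untested sketch, assuming setCopyStreamListener() and org.apache.commons.net.io.CopyStreamAdapter are available in the version used:

client.setCopyStreamListener(new org.apache.commons.net.io.CopyStreamAdapter() {
    @Override
    public void bytesTransferred(long totalBytesTransferred, int bytesTransferred, long streamSize) {
        // update the download progress here; streamSize may be -1 when the total size is unknown
    }
});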

As a result of all those actions I can see the file is downloading until something very close to 100% and then the exception occurs:

java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(Unknown Source)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at java.io.BufferedInputStream.fill(Unknown Source)
        at java.io.BufferedInputStream.read1(Unknown Source)
        at java.io.BufferedInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at <my code from here on>

There seems to be no firewall involved, but when the internet connection is faster the download succeeds (probably some sort of timeout is being hit). I would think the problem is with the connection, but the thing is that FileZilla manages to download the same file.

So, I can reformulate my question like this: how do I make my FTPClient behave like FileZilla when downloading a file? Probably there's some complicated retry/ping logic that I'm not aware of.

Commons Net: commons-net-3.6

FTP Server: proftpd-1.3.3g-6.el5 on CentOS 5.8 with the default configuration; it does not support FTP over TLS.

Seems like it's caused by the data timeout defined on the following line:

client.setDataTimeout(20000);

According to the JavaDoc:

Sets the timeout in milliseconds to use when reading from the data connection. This timeout will be set immediately after opening the data connection, provided that the value is ≥ 0.

Note: the timeout will also be applied when calling accept() whilst establishing an active local data connection.

Parameters: timeout - The default timeout in milliseconds that is used when opening a data connection socket. The value 0 means an infinite timeout.

Could you try setting this value to 0 (which means infinite in this context)?
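
For reference, that would mean changing the setup from the question to something like this (just a sketch):

FTPClient client = new FTPClient();
client.setDataTimeout(0);        // 0 = no read timeout on the data connection
client.setConnectTimeout(20000); // the connect timeout can stay as it was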

I don't know the actual reason for this phenomenon (a timeout on the last chunk of a file), but I checked with Wireshark what FileZilla does to download the file and found that it runs into the very same problem with the same timeout. FileZilla then reconnects to the server and sends a REST FTP command to resume the download of this specific file from the point where it was aborted, i.e. to fetch only the last chunk.

So, the solution would be to add some sort of retry logic to the download process, so that this chunk of code:

InputStream in = new BufferedInputStream(client.retrieveFileStream(ftpFilePath), 16384);
// ...
byte[] buffer = new byte[8192];
while ((rd = in.read(buffer)) > 0) {

Becomes this:

InputStream in = new BufferedInputStream(client.retrieveFileStream(ftpFilePath), 16384);
// ...
byte[] buffer = new byte[8192];
int rd;
long totalRead = 0;
for (int resumeTry = 0; resumeTry <= RESUME_TRIES; ++resumeTry) {
    try {
        while ((rd = in.read(buffer)) > 0) {
            //...
            totalRead += rd;
        }
        break; // the whole file has been read successfully
    } catch (SocketTimeoutException ex) {
        // omitting exception handling
        in.close();
        client.abort();
        client.connect(...);
        client.login(...);
        client.setFileType(FTPClient.BINARY_FILE_TYPE);
        client.enterLocalPassiveMode();
        client.setRestartOffset(totalRead); // resume from where the previous attempt stopped
        in = client.retrieveFileStream(...);
        if (in == null) {
            // the FTP server doesn't support REST FTP query
            throw ex;
        }
        in = new BufferedInputStream(in, 16384);
    }
}
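
One detail the snippet above omits: when retrieveFileStream() is used, the stream should be closed and completePendingCommand() called once the loop finishes, so that the client reads the server's final reply and the transfer is properly finalized. Roughly (error handling omitted):

in.close();
if (!client.completePendingCommand()) {
    // the server did not confirm a successful transfer
    throw new IOException("Download did not complete: " + client.getReplyString());
}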
