I need to automate pulling (GET) files from a wide variety of FTP services, spread across different domains, that receive files around the clock.
My problem is that FTP services, in general, allow a file to be downloaded while it is still being uploaded. This is one of the references to the problem that can be found on the internet.
This can lead to an incomplete download.
I tried to replicate the situation using a Windows server and the FileZilla FTP client and, as expected, got only half of the file, so no safety mechanism was in place to prevent this. So maybe there is simply no way to prevent it from the client side.
So my question is whether there is some anchor, something my client can test, to know for sure that the FTP server already has the whole file.
I find it hard to believe that a protocol as old as FTP doesn't provide a safety mechanism, so either I am missing something or this is by design.
Update: I am developing the automation in C#, but any technical tip can help. The solution needs to be foolproof because it is critical for the business.
Update 2: The uploads are made by many different clients, so it is impossible to establish a convention with all of them.
Update 3: This question is similar to the question "How to detect that a file is being uploaded over FTP", but has the additional restriction presented in Update 2.
I created the following automated solution, based on input from the answers to this post and others, to address my problem as it is: pulling files from different FTP servers, from different vendors, in a scenario where concurrency is very likely to happen.
Using signal files or the other mechanisms suggested in this post would require forcing clients to change the way they interact with us, so that is a solution for most cases but not for my particular problem.
So, my solution was:
This solution allows us to poll FTP folders intensively.
I believe that from the client side, there's not much you can do.
At most, you could re-check the file size after some time, see whether it has changed, and take whatever steps are required to get the new content.
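The size re-check described above could be sketched roughly as follows. This is only an illustration in Python, not the asker's C# code: the names `wait_until_size_stable` and `ftp_size_probe` and all default values are hypothetical. It also assumes the server supports the `SIZE` command, which is an extension and not part of the original FTP specification, so some servers may reject it. A size that stops changing is only a heuristic: a stalled upload looks exactly like a finished one.

```python
import time
from ftplib import FTP
from typing import Callable, Optional


def wait_until_size_stable(
    get_size: Callable[[], Optional[int]],
    interval: float = 5.0,            # seconds between checks (assumed value)
    required_stable_checks: int = 3,  # consecutive unchanged readings needed
    max_checks: int = 60,             # give up after this many readings
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the size once it has stayed unchanged for the required number
    of consecutive re-checks, or None if it never settled."""
    last_size = None
    stable = 0
    for _ in range(max_checks):
        size = get_size()
        if size is not None and size == last_size:
            stable += 1
            if stable >= required_stable_checks:
                return size
        else:
            stable = 0  # size changed (or was unavailable): start over
        last_size = size
        sleep(interval)
    return None


def ftp_size_probe(ftp: FTP, path: str) -> Callable[[], Optional[int]]:
    """Build a size probe for one remote file (hypothetical helper)."""
    def probe() -> Optional[int]:
        ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
        return ftp.size(path)
    return probe
```

With a connected `ftplib.FTP` object, `wait_until_size_stable(ftp_size_probe(ftp, "incoming/report.csv"))` would return the settled size or `None`; the path here is made up. The download would then start only after a non-`None` result, and the downloaded byte count can be compared against that size as a second check.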
FTP was not designed as a protocol for any kind of real-time exchange of data between two clients through an FTP server. There is no notification to a client that a file it wants to download is still being uploaded, nor is there any indication, when overwriting a file, that somebody is currently downloading it. This is not a design error in the FTP protocol. The real problem is that you are trying to use a protocol for a purpose it was not designed for.
So you have this scenario:
[Publisher] --uploads file--> [FTP Server] --downloads file--> [You]
You have a publisher who is uploading files to an FTP server, and you download from the same FTP server. There can also be different FTP Server instances, one for upload and one for download, looking at the same directory, but that doesn't change much.
Now because you're looking at the same directory, you, the downloader, see files as soon as the filesystem entry is created - when the first bytes from the publisher may even still be in flight.
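The effect can be reproduced without any FTP server at all. The sketch below is only a local simulation, with a temporary file standing in for the shared directory; the function name `simulate_partial_download` is hypothetical. It writes the first part of a payload, reads the file back the way a downloader would, and only then writes the rest.

```python
import os
import tempfile


def simulate_partial_download(payload: bytes, bytes_uploaded_so_far: int) -> bytes:
    """Return what a reader sees when it opens the file mid-upload."""
    fd, path = tempfile.mkstemp()  # stands in for the file on the FTP server
    try:
        with os.fdopen(fd, "wb") as upload:
            # The publisher has delivered only the first part so far.
            upload.write(payload[:bytes_uploaded_so_far])
            upload.flush()  # received bytes are now visible to other readers

            # The downloader sees the directory entry and reads what is there.
            with open(path, "rb") as download:
                seen = download.read()

            # The rest of the upload arrives after the download has finished.
            upload.write(payload[bytes_uploaded_so_far:])
        return seen
    finally:
        os.remove(path)


partial = simulate_partial_download(b"0123456789", 4)
print(partial)  # b'0123'
```

The reader gets a short but otherwise valid file and no error, which is why the downloader cannot tell from the transfer alone that anything is missing.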
There are basically three solutions for this: