
Processing SFTP files using C# Parallel.ForEach loop not processing downloads

I am using the Renci SSH.NET package (the 2016 version) to download files from an external server. I can usually download about one file every 6 seconds, which is bad when you have thousands of files. I recently tried changing the foreach loops to Parallel.ForEach, and that cut the time per file to about 1.5 seconds. Except when I checked the files, they were all 0 KB, so nothing was actually downloaded. Is there anything wrong with the parallel loop? I am new to C# and trying to improve download times.

Parallel.ForEach(summary.RemoteFiles, (f, loopstate) =>
{
    //Are we still connected? If not, reestablish a connection for up to a max of "MaxReconnectAttempts" 
    if (!sftp.IsConnected)
    {
        int maxAttempts = Convert.ToInt32(ConfigurationManager.AppSettings["MaxReconnectAttempts"]);

        StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service has been disconnected from remote system, attempting to reconnect (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt 1 of " + maxAttempts.ToString() + ")", Location = locationName });

        for (int attempts = 1; attempts <= maxAttempts; attempts++)
        {
            sftp.Connect();

            if (sftp.IsConnected)
            {
                StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service - Connection reestablished (" + remotePath + ")", Location = locationName });
                break;
            }
            else
            {
                if ((attempts + 1) <= maxAttempts)
                {
                    StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service still disconnected from remote system, preparing another reconnect attempt (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt " + (attempts + 1).ToString() + " of " + maxAttempts.ToString() + ")", Location = locationName });
                    System.Threading.Thread.Sleep(2000);
                }
                else
                {
                    //Max reconnect attempts reached - end the session and ensure the appropriate "failure" workflow is triggered
                    connectionLost = true;
                }
            }
        }
    }

    if (connectionLost)
        loopstate.Break();
       // break;


    totalFileCount++;
    try
    {
        if (!System.IO.File.Exists(localSaveLocation + f.FileName))
        {
            System.Diagnostics.Debug.WriteLine("\tDownloading file " + totalFileCount.ToString() + "(" + f.FileName + ")");

            System.IO.Stream localFile = System.IO.File.OpenWrite(localSaveLocation + f.FileName);
            //Log remote file name, local file name, date/time start
            start = DateTime.Now;
            sftp.DownloadFile(f.FullName, localFile);
            end = DateTime.Now;

            //Log remote file name, local file name, date/time complete (increment the "successful" downloads by 1)
            timeElapsed = end.Subtract(start);
            runningSeconds += timeElapsed.TotalSeconds;
            runningAvg = runningSeconds / Convert.ToDouble(totalFileCount);
            estimatedSecondsRemaining = (summary.RemoteFiles.Count - totalFileCount) * runningAvg;

            elapsedTimeString = timeElapsed.TotalSeconds.ToString("#.####") + " seconds";
            System.Diagnostics.Debug.WriteLine("\tCompleted downloading file in " + elapsedTimeString + " " + "(" + f.FileName + ")");
            downloadedFileCount++;
            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = timeElapsed.TotalSeconds, fileName = f.FileName, fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });

            f.FileDownloaded = true;

            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
        else
        {
            System.Diagnostics.Debug.WriteLine("\tFile " + totalFileCount.ToString() + "(" + f.FileName + ") already exists locally");
            downloadedFileCount++;

            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = 0, fileName = f.FileName + " (File already exists locally)", fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });
            f.FileDownloaded = true;

            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
    }
    catch (System.Exception ex)
    {
       // We log stuff here
    }

}); 

I cannot tell why you get empty files, though I'd suspect it is because you do not close the localFile stream.
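
For illustration, a minimal sketch of that fix: wrap the download in a using block so the local stream is flushed and closed. It reuses the sftp, f and localSaveLocation variables from the question's loop body.

// Sketch of the suggested fix: dispose the local stream so buffered data is flushed to disk.
// Reuses the `sftp`, `f` and `localSaveLocation` variables from the question's loop body.
using (var localFile = System.IO.File.OpenWrite(localSaveLocation + f.FileName))
{
    sftp.DownloadFile(f.FullName, localFile);
} // The stream is closed and flushed here, even if DownloadFile throws.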

Though, even if your code worked, you would hardly get any performance benefit from using the same connection for all the downloads, as SFTP transfers tend to be limited by network latency or CPU. You have to use multiple connections to overcome that.

See my answer on Server Fault about factors that affect SFTP transfer speed.

Implement a connection pool and pick a free connection each time.


Simple example:

var clients = new ConcurrentBag<SftpClient>();

var opts = new ParallelOptions { MaxDegreeOfParallelism = maxConnections };

Parallel.ForEach(files, opts, (f, loopstate) => {
    // Take a free connection from the pool, or open a new one if none is available
    if (!clients.TryTake(out var client))
    {
        client = new SftpClient(hostName, userName, password);
        client.Connect();
    }

    string localPath = Path.Combine(destPath, f.Name);
    Console.WriteLine(
        "Thread {0}, Connection {1}, File {2} => {3}",
        Thread.CurrentThread.ManagedThreadId, client.GetHashCode(),
        f.FullName, localPath);

    // Dispose the local stream when the download completes, so the data is flushed to disk
    using (var stream = File.Create(localPath))
    {
        client.DownloadFile(f.FullName, stream);
    }

    // Return the connection to the pool so another file/thread can reuse it
    clients.Add(client);
});

Console.WriteLine("Closing {0} connections", clients.Count);

foreach (var client in clients)
{
    client.Dispose();
}

Another approach is to start a fixed number of threads, each with its own connection, and have them pick files from a queue.

For an example implementation, see my article for the WinSCP .NET assembly:
Automating transfers in parallel connections over SFTP/FTP protocol
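
If you want to prototype that queue-based approach with SSH.NET itself (rather than WinSCP), the following is a rough, hypothetical sketch only: a thread-safe queue of the remote files and a fixed number of worker threads, each holding its own SftpClient. It assumes the same files, hostName, userName, password, destPath and maxConnections variables as the pool example above.

// Hypothetical sketch of the queue-based approach with SSH.NET (not the WinSCP article's code).
// Needs System.Collections.Concurrent, System.IO, System.Linq, System.Threading and Renci.SshNet.
// Assumes `files` is a collection of Renci.SshNet.Sftp.SftpFile, e.g. from SftpClient.ListDirectory.
var queue = new ConcurrentQueue<Renci.SshNet.Sftp.SftpFile>(files);

var threads = Enumerable.Range(0, maxConnections).Select(_ => new Thread(() =>
{
    // One dedicated connection per worker thread
    using (var client = new SftpClient(hostName, userName, password))
    {
        client.Connect();

        // Keep taking files until the shared queue is drained
        while (queue.TryDequeue(out var f))
        {
            string localPath = Path.Combine(destPath, f.Name);
            using (var stream = File.Create(localPath))
            {
                client.DownloadFile(f.FullName, stream);
            }
        }
    }
})).ToList();

threads.ForEach(t => t.Start());
threads.ForEach(t => t.Join());

Each worker drains the queue at its own pace, so faster connections naturally take more files, while the number of open connections stays fixed at maxConnections.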


A similar question about FTP:
Download multiple files concurrently from FTP using FluentFTP with a maximum value
