
C# Async Zip File Extraction

I've done some looking, but haven't found much in the way of example async methods that are not either incredibly complex or very simple.

I've been trying to make an application more efficient, and am convinced I don't need to implement my own threading. Basically this code is from an application that already has functioning async calls that use Cookie Aware Web Client code to log into an HTTPS site, grab the cookie, enumerate a specific page with the authentication cookie and then download specific files. Said specific files are "zip" format with extension ".bfp". This section of code below is used to extract the zip files (there are tens of thousands of them from over 100 source IPs being downloaded into a folder structure).

My problem is that the application freezes part way through parsing the files using this async setup, but once in a while will finish. Sometimes it hangs and just sits there, other times it crashes and Windows Error Reporting pops up.

As a note, I have a BS CIS degree, but I am not a programmer in my day to day job. I am a systems engineer/architect (I have a BS and MS in that as well). I have no one here to bounce my code off that has any programming experience. What I know about the async/await libraries I've learned from Channel9 videos from MS and Google. If this implementation looks terrible, it could be, and I do NOT claim to be an expert, especially with async. I am leaving out most of the code for the other parts of the app that are running fine, as it has information I would prefer not to share and it is not relevant to the question/problem.

private async void ParseZipFiles()
{

    await UpdateMain("Started Parsing Compressed Log Files." + Environment.NewLine);
    FileInfo[] diZip = new DirectoryInfo(@"C:\LogFiles\ZipFiles\").GetFiles("*.bfp",SearchOption.AllDirectories);
    await Task.WhenAll(diZip.Select(async s => await Task.Run(async () => await ParseLogFile(s.FullName))));
    await UpdateMain("Finished Parsing Compressed Log Files." + Environment.NewLine);
}

private async Task ParseZipFile(string filename)
{
    try
    {
        using (ZipArchive bfpFile = await Task.Run(async () => new ZipArchive((Stream)new FileStream(filename, FileMode.Open))))
        {
            await Task.WhenAll(bfpFile.Entries.Select(async s => await Task.Run(async () =>
            {
               if (s.FullName.EndsWith(".log", StringComparison.OrdinalIgnoreCase) && s.Name.Split('.').First().All(char.IsDigit) == true && s.Name.Split('.').Count() == 2)
                {
                    string extractPath = Path.Combine(@"C:\LogFiles\Extracted\" + filename.Split('\\')[3].ToString() + @"\", s.Name);
                    await Task.Run(async () => await Task.Run(() =>  s.ExtractToFile(extractPath, true)));
                }
            })));
        }
    }
    catch
    {
        await UpdateMain("Compressed archive: " + filename + " is corrupted. Probably on the source system." + Environment.NewLine);
    }
}

I have used Task.WhenAll() before, and I think you want to simplify every line that uses it. You have far more await statements than you need, and each Task.Run call queues its work to another thread-pool thread. The way the await statements are nested is likely creating conditions where nothing finishes. So why not have one thread per zip?

using (ZipArchive bfpFile = new ZipArchive(new FileStream(filename, FileMode.Open)))
{
    // One Task.Run per entry, with no nested async lambdas or extra awaits.
    await Task.WhenAll(bfpFile.Entries.Select(s => Task.Run(() =>
    {
        // Keep only entries named <digits>.log
        if (s.FullName.EndsWith(".log", StringComparison.OrdinalIgnoreCase) && s.Name.Split('.').First().All(char.IsDigit) && s.Name.Split('.').Count() == 2)
        {
            string extractPath = Path.Combine(@"C:\LogFiles\Extracted\" + filename.Split('\\')[3] + @"\", s.Name);
            s.ExtractToFile(extractPath, true);
        }
    })));
}
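The "one thread per zip" idea can also be taken literally: run one task per archive and extract that archive's entries one after another inside it. The sketch below assumes that layout. The names BfpExtractor, IsNumericLog, ExtractLogs, ExtractAllAsync and the destinationFor callback are hypothetical and not from the original code. It also leans on the assumption that a single ZipArchive instance should not be used from several threads at once, since its instance members are not documented as thread-safe.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Threading.Tasks;

public static class BfpExtractor
{
    // Same filter as the question: exactly one dot, digits before it, "log" after it.
    public static bool IsNumericLog(string name)
    {
        string[] parts = name.Split('.');
        return parts.Length == 2
            && parts[0].All(char.IsDigit)
            && parts[1].Equals("log", StringComparison.OrdinalIgnoreCase);
    }

    // Extracts matching entries sequentially on the calling thread.
    // Returns the number of files written.
    public static int ExtractLogs(string zipPath, string destinationDir)
    {
        Directory.CreateDirectory(destinationDir);
        int count = 0;
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                if (!IsNumericLog(entry.Name)) continue;
                entry.ExtractToFile(Path.Combine(destinationDir, entry.Name), true);
                count++;
            }
        }
        return count;
    }

    // One thread-pool task per archive; destinationFor (hypothetical) maps
    // an archive path to its output folder.
    public static Task<int[]> ExtractAllAsync(string[] zipPaths, Func<string, string> destinationFor)
    {
        return Task.WhenAll(zipPaths.Select(p => Task.Run(() => ExtractLogs(p, destinationFor(p)))));
    }
}
```

This sketch has no try/catch, so an exception from a corrupt archive surfaces through the task returned by ExtractAllAsync. Wrapping the body of ExtractLogs in the question's try/catch would restore the per-file "corrupted" message.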

Also, change the Task.WhenAll line in ParseZipFiles() to this:

await Task.WhenAll(diZip.Select(s => Task.Run(() => ParseLogFile(s.FullName))));

I'm not sure whether that line was meant to call ParseZipFile() instead of ParseLogFile(), but you have the code and I don't.
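To show why the extra async/await wrappers add nothing, here is a small self-contained sketch. When Task.Run is given a lambda that returns a Task, the task it hands back completes only once that inner task has completed, so Task.WhenAll can wait on it directly. WhenAllDemo, ProcessAsync and RunAsync are hypothetical stand-ins, with ProcessAsync playing the role of ParseZipFile.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for ParseZipFile: simulates async work, then records its input.
    private static async Task ProcessAsync(string name, ConcurrentBag<string> done)
    {
        await Task.Delay(10);
        done.Add(name);
    }

    // Returns how many items had finished by the time WhenAll completed.
    public static async Task<int> RunAsync(string[] names)
    {
        var done = new ConcurrentBag<string>();

        // No "async s => await Task.Run(async () => await ...)" layering:
        // Task.Run(Func<Task>) already represents the inner task's completion.
        await Task.WhenAll(names.Select(n => Task.Run(() => ProcessAsync(n, done))));

        return done.Count;
    }
}
```

Calling `await WhenAllDemo.RunAsync(new[] { "a", "b", "c" })` should yield 3, because every ProcessAsync call has finished before WhenAll completes.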
