Downloading multiple files synchronously C#

I'm trying to download multiple files using several threads. The program uses a BFS algorithm to reach all the files under a given URL: http://www.police.am/Hanraqve/

The problem is that the same file can be downloaded multiple times when several threads are running. I'm looking for a way to synchronize the download process, with the help of mutexes or semaphores, so that each file is downloaded only once. Any idea or actual code would be very much appreciated. Here is my initial code:

    public static async Task Download()
    {
        nodes.Enqueue(root);
        while (nodes.Count() != 0)
        {
            String currentNode = "";
            if (nodes.TryDequeue(out currentNode))
            {
                if (!visitedNodes.Contains(currentNode))
                {
                    visitedNodes.Add(currentNode);
                    if (isFolder(currentNode))
                    {
                        List<String> urls = GetUrlsFromHtml(currentNode);
                        foreach (String url in urls)
                        {
                            nodes.Enqueue(url);
                        }
                    }
                    else
                    {
                        string fileName = currentNode.Remove(0, currentNode.LastIndexOf('/') + 1);

                        using (WebClient webClient = new WebClient())
                        {
                            await webClient.DownloadFileTaskAsync(new Uri(currentNode), destinationFolderPath + @"\" + fileName);
                            files.Enqueue(destinationFolderPath + @"\" + fileName);
                        }
                    }
                }

            }
        }


        //cts.Cancel();
    }

    public static List<String> GetUrlsFromHtml(string url)
    {
        HtmlWeb hw = new HtmlWeb();
        HtmlDocument doc = hw.Load(url);
        List<String> urls = new List<String>();
        foreach (HtmlNode htmlNode in doc.DocumentNode.SelectNodes("//a[@href]"))
        {
            string hrefValue = htmlNode.Attributes["href"].Value;
            if (hrefValue[0] >= '1' && hrefValue[0] <= '9')
            {
                urls.Add(url + hrefValue);
            }
        }
        return urls;
    }

    public static bool isFolder(string url)
    {
        return url.EndsWith("/");
    }
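One possible way to guarantee that each URL is handled once: in the code above, `visitedNodes.Contains(...)` followed by `visitedNodes.Add(...)` is two separate steps, so two threads can both pass the check before either one adds the URL. The sketch below merges the check and the insert into a single atomic call with `ConcurrentDictionary.TryAdd`. It is a minimal illustration, not the asker's code: the class name `ClaimOnceDemo` and the helpers `TryClaim` and `CountClaimsAsync` are hypothetical, and the dictionary stands in for `visitedNodes`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ClaimOnceDemo
{
    // Hypothetical stand-in for visitedNodes: the key is the URL, the value is unused.
    private static readonly ConcurrentDictionary<string, byte> visited =
        new ConcurrentDictionary<string, byte>();

    // TryAdd is atomic: for a given URL, exactly one caller receives true.
    public static bool TryClaim(string url)
    {
        return visited.TryAdd(url, 0);
    }

    // Lets several workers race for the same URL and counts how many win the claim.
    public static async Task<int> CountClaimsAsync(string url, int workers)
    {
        int claims = 0;
        var tasks = Enumerable.Range(0, workers).Select(_ => Task.Run(() =>
        {
            if (TryClaim(url))
            {
                // Only the winning worker would start the download here.
                Interlocked.Increment(ref claims);
            }
        })).ToArray();

        await Task.WhenAll(tasks);
        return claims;
    }
}
```

In the BFS loop, the `Contains`/`Add` pair would then become a single `if (TryClaim(currentNode))`. No explicit mutex or semaphore is needed for the "download once" guarantee; a `SemaphoreSlim` could still be added separately if the number of simultaneous downloads has to be capped.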

Check the URLs you're storing in visitedNodes; they may be different strings but still lead to the same page.

http://foo.com?a=bar
http://foo.bar?b=foo
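To reduce such duplicates, URLs can be brought to a common form before they are stored or compared. The sketch below is one assumption about what that could look like: it relies on `System.Uri` to canonicalize the scheme and host and uses `GetLeftPart(UriPartial.Query)` to drop the fragment. The class name `UrlKey` and the method `Normalize` are hypothetical. Note that it only catches syntactic differences; two URLs with different query strings that happen to serve the same page, as in the example above, cannot be detected this way.

```csharp
using System;

public static class UrlKey
{
    // Builds a comparison key for a URL.
    // System.Uri lower-cases the scheme and host when parsing;
    // GetLeftPart(UriPartial.Query) keeps everything up to and including
    // the query string, which leaves out the "#fragment" part.
    // Path and query are not altered further, since they may be case-sensitive.
    public static string Normalize(string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        return uri.GetLeftPart(UriPartial.Query);
    }
}
```

The key returned by `UrlKey.Normalize(currentNode)` would be what goes into the visited set, while the original URL is still the one passed to the download call.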
