Throws TargetInvocationException when downloading with WebClient DownloadStringAsync

I am trying to download multiple webpages using the WebClient class. When I try to download a website's HTML, a TargetInvocationException is thrown, and I do not know why it happens. Here is my code:

    public HashSet<string> DownloadWebpages(HashSet<string> urls)
    {
        HashSet<string> HTML = new HashSet<string>();

        for (int i = 0; i < urls.Count; i++)
        {
            WebClient client = new WebClient();
            client.DownloadStringCompleted += (s, e) =>
            {
                try
                {
                    lock (HTML)
                    {
                        HTML.Add(e.Result); //The exception happens on this line  
                    }
                }
                catch { }
            };
            client.DownloadStringAsync(new Uri(urls.ElementAt(i)));
        }
        return HTML;
    }

Is there any way to fix this? All I'm trying to do is download multiple webpages asynchronously, as fast as possible.

WebClient is an obsolete class, replaced by HttpClient since 2012. It was never built with HTTP APIs or thread safety in mind. It's easier to do what you want with HttpClient's GetStringAsync:

public async Task<HashSet<string>> DownloadWebpages(IEnumerable<string> urls)
{
    HashSet<string> HTML = new HashSet<string>();

    var client=new HttpClient();
    foreach (var url in urls)
    {
        var source=await client.GetStringAsync(url);
        HTML.Add(source);
    }
    return HTML;
}

Since .NET 6 you can even retrieve the URLs concurrently with Parallel.ForEachAsync. In this case you'd need a thread-safe collection to store the results, e.g. a ConcurrentDictionary:

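HttpClient _client = new HttpClient();

public async Task<ConcurrentDictionary<string,string>> DownloadWebpages(IEnumerable<string> urls)
{
    var HTML = new ConcurrentDictionary<string,string>();

    // The body delegate receives the current item and a CancellationToken
    await Parallel.ForEachAsync(urls, async (url, ct) =>
    {
        var source = await _client.GetStringAsync(url, ct);
        // The indexer is thread-safe and overwrites duplicate URLs instead of throwing
        HTML[url] = source;
    });
    return HTML;
}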

HttpClient is thread-safe and meant to be reused.
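
If you can't use .NET 6, a rough sketch of the same idea is to start all the requests on one shared HttpClient and await them together with Task.WhenAll. The names PageDownloader and DownloadAllAsync and the optional fetch parameter are made up for this example (fetch is only there so the method can be tried without a network), so treat it as an illustration rather than a definitive implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageDownloader
{
    // A single shared instance, reused for every request.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper. The optional `fetch` delegate is an assumption
    // added for illustration; by default it calls HttpClient.GetStringAsync.
    public static async Task<Dictionary<string, string>> DownloadAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch = null)
    {
        fetch = fetch ?? (url => Client.GetStringAsync(url));

        // Start every request before awaiting any of them.
        List<string> distinct = urls.Distinct().ToList();
        List<Task<string>> pending = distinct.Select(fetch).ToList();

        // Task.WhenAll returns the results in the same order as the tasks.
        string[] pages = await Task.WhenAll(pending);

        var html = new Dictionary<string, string>();
        for (int i = 0; i < distinct.Count; i++)
        {
            html[distinct[i]] = pages[i];
        }
        return html;
    }
}
```

Note that this starts every request at once with no throttling, and if any download fails, awaiting Task.WhenAll throws, so wrap the call in a try/catch if one bad URL should not abort the whole batch.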

If you absolutely have to use WebClient (why???) you can use DownloadStringTaskAsync. You won't be able to make concurrent calls on the same instance, though, because WebClient isn't thread-safe.

public async Task<HashSet<string>> DownloadWebpages(IEnumerable<string> urls)
{
    HashSet<string> HTML = new HashSet<string>();

    var client=new WebClient();
    foreach (var url in urls)
    {
        var source=await client.DownloadStringTaskAsync(url);
        HTML.Add(source);
    }
    return HTML;
}

The TargetInvocationException is thrown because WebClient is not able to download the website. Here is a test:

string html = new WebClient().DownloadString("https://www.siteth@tw!llcause an error!/randompage/");

This will throw an exception. So if you try to download the same webpage with your code, it will cause a TargetInvocationException.
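
To see the real cause inside the event handler, check e.Error (and e.Cancelled) before touching e.Result. When the download has failed, reading e.Result throws a TargetInvocationException whose InnerException is the actual error, and e.Error holds that same error directly. A minimal sketch, where the class and method names are made up for illustration and example.invalid is just a host name that should never resolve:

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

public static class CompletedHandlerSketch
{
    // Hypothetical helper: completes with the page text, or with the real
    // error when the download fails, instead of throwing from e.Result.
    public static Task<(string Html, Exception Error)> TryDownloadAsync(string url)
    {
        var tcs = new TaskCompletionSource<(string Html, Exception Error)>();
        var client = new WebClient();

        client.DownloadStringCompleted += (s, e) =>
        {
            if (e.Cancelled)
            {
                tcs.TrySetResult((null, new OperationCanceledException()));
            }
            else if (e.Error != null)
            {
                // Reading e.Result here would throw TargetInvocationException.
                tcs.TrySetResult((null, e.Error));
            }
            else
            {
                tcs.TrySetResult((e.Result, null));
            }
        };

        client.DownloadStringAsync(new Uri(url));
        return tcs.Task;
    }
}
```

Calling `await CompletedHandlerSketch.TryDownloadAsync("http://example.invalid/")` should give back a null Html and a non-null Error describing why the request failed, which is the information the empty catch block in the question throws away.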
