
How to fetch API data using WebClient and multiple threads?

I am trying to query an API that is accessible via HTTP (no authorization). To speed things up, I tried a Parallel.ForEach loop, but the longer it runs, the more errors pop up.

More and more requests fail to return a response. I know the API provider isn't limiting me, because I can request the very same failed URLs in my web browser. Also, different URLs fail on each run, so malformed requests don't seem to be the cause.

The error doesn't seem to occur when I use a single-threaded foreach loop.

My malfunctioning loop is below:

    Parallel.ForEach(this.urlArray, singleUrl => {
        this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    });

A normal foreach loop works fine but is very slow:

    foreach (string singleUrl in this.urlArray) {
        this.apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        this.responsesDictionary.Add(singleUrl, apiResponseBlob);
    }

Also: I had a solution in PHP - I spawned several "fetchers" simultaneously and it never hung. It seems strange to me that PHP would handle multithreaded retrieval better than C#, so I must be missing something.

What is the fastest way to query the API without these strange failures?

Hi, did you try to speed up your code with async downloads, as in this question (see the accepted answer):

DownloadStringAsync wait for request completion

You could loop through your URIs and get a callback for each successful download.
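A minimal sketch of that idea, using the task-based WebClient.DownloadStringTaskAsync rather than the event-based DownloadStringAsync callbacks of the linked answer. The names AsyncFetcher, FetchAllAsync and DownloadAsync are hypothetical, and the fetch delegate is injected only so that the loop can be tried without a network:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original question's code.
public static class AsyncFetcher
{
    // Starts one download per distinct URL without blocking a thread
    // per request, then waits until all of them have completed.
    public static async Task<Dictionary<string, string>> FetchAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch)
    {
        var pending = urls
            .Distinct() // duplicate URLs would collide as dictionary keys
            .Select(async url =>
                new KeyValuePair<string, string>(url, await fetch(url)))
            .ToList();

        // Each task produces its own key/value pair, so no shared
        // variable is written while the downloads are in flight.
        KeyValuePair<string, string>[] pairs = await Task.WhenAll(pending);
        return pairs.ToDictionary(p => p.Key, p => p.Value);
    }

    // Default fetcher: a fresh WebClient per request, because a single
    // WebClient instance does not support concurrent operations.
    public static async Task<string> DownloadAsync(string url)
    {
        using (var client = new WebClient())
        {
            return await client.DownloadStringTaskAsync(url);
        }
    }
}
```

It could be called as `var responses = await AsyncFetcher.FetchAllAsync(urlArray, AsyncFetcher.DownloadAsync);`. If any single download throws, awaiting Task.WhenAll rethrows that exception, so retry or per-URL error handling would have to go inside the fetch delegate.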

EDIT: I have seen that you use

this.apiResponseBlob = DL

When you use multithreading, every thread tries to write to that variable. This could be a reason for your bug. Try using a local instance of that object type, or use

lock{}

so that only one thread can write this variable at a time. http://msdn.microsoft.com/de-de/library/c5kehkcz.aspx

like this:

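    Parallel.ForEach(this.urlArray, singleUrl => {
        // Local variable: each thread gets its own copy of the response.
        var apiResponseBlob = new System.Net.WebClient().DownloadString(singleUrl);
        // Lock on one shared object. Locking on the per-URL string would
        // not stop two threads from calling Add at the same time.
        lock (this.responsesDictionary) {
            this.responsesDictionary.Add(singleUrl, apiResponseBlob);
        }
    });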
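As an alternative to the explicit lock, a thread-safe collection could hold the results. The following is only a sketch under assumptions: ParallelFetcher, FetchAll and the maxParallel default of 4 are hypothetical names and values, and the fetch delegate stands in for the WebClient call so the loop can be run without a network.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original question's code.
public static class ParallelFetcher
{
    public static ConcurrentDictionary<string, string> FetchAll(
        IEnumerable<string> urls, Func<string, string> fetch, int maxParallel = 4)
    {
        // ConcurrentDictionary allows writes from several threads,
        // so no lock statement is needed around the insert.
        var responses = new ConcurrentDictionary<string, string>();

        // Assumed cap on simultaneous requests; tune it for the API.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            // Local variable instead of a shared field: no other
            // thread can overwrite it between download and insert.
            string body = fetch(url);

            // The indexer overwrites on a duplicate URL, whereas
            // Dictionary.Add would throw.
            responses[url] = body;
        });

        return responses;
    }
}
```

With the question's fields it might be used as `var responses = ParallelFetcher.FetchAll(this.urlArray, url => { using (var c = new System.Net.WebClient()) return c.DownloadString(url); });`.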
