Limiting concurrent requests using Rx and SelectMany
I have a list of URLs of pages that I want to download concurrently using HttpClient. The list of URLs can be large (100 or more!).
I currently have this code:
    var urls = new List<string>
    {
        "http://www.amazon.com",
        "http://www.bing.com",
        "http://www.facebook.com",
        "http://www.twitter.com",
        "http://www.google.com"
    };

    var client = new HttpClient();

    // SelectMany starts a request for every URL as soon as it arrives.
    var contents = urls
        .ToObservable()
        .SelectMany(uri => client.GetStringAsync(new Uri(uri, UriKind.Absolute)));

    contents.Subscribe(Console.WriteLine);
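The problem: because of the use of SelectMany, a large number of tasks is created almost at the same time. It seems that if the list of URLs is big enough, many of the tasks time out (I get "A task was canceled" exceptions).
So I think there should be a way, perhaps using some kind of scheduler, to limit the number of concurrent tasks so that no more than 5 or 6 run at any given time.
That way I could get concurrent downloads without launching so many tasks that they stall, as they do right now.
How can I do this so that I don't saturate the client with lots of timed-out tasks?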
Remember that SelectMany() is really Select().Merge(). SelectMany has no maxConcurrent parameter, but Merge() does, so you can use that instead.
Based on your example, you could do this:
    var urls = new List<string>
    {
        "http://www.amazon.com",
        "http://www.bing.com",
        "http://www.facebook.com",
        "http://www.twitter.com",
        "http://www.google.com"
    };

    var client = new HttpClient();

    var contents = urls
        .ToObservable()
        // FromAsync defers each request until Merge subscribes to it,
        // so a request is not started before a slot is free.
        .Select(uri => Observable.FromAsync(() => client.GetStringAsync(uri)))
        .Merge(2); // at most 2 concurrent requests

    contents.Subscribe(Console.WriteLine);
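To see the cap in action without touching the network, here is a minimal sketch that swaps the HTTP call for a short Task.Delay and records the peak number of calls in flight. MergeThrottleDemo, FakeDownloadAsync and RunAsync are hypothetical names made up for this illustration, and the sketch assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Reactive.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MergeThrottleDemo
{
    private static int _inFlight; // calls currently running
    private static int _peak;     // highest value _inFlight has reached

    // Hypothetical stand-in for client.GetStringAsync(uri):
    // it only waits for a moment and tracks concurrency.
    private static async Task<string> FakeDownloadAsync(int id)
    {
        int now = Interlocked.Increment(ref _inFlight);

        // Raise _peak to 'now' if 'now' is larger (lock-free max).
        int seen;
        while (now > (seen = Volatile.Read(ref _peak)))
        {
            Interlocked.CompareExchange(ref _peak, now, seen);
        }

        await Task.Delay(100);
        Interlocked.Decrement(ref _inFlight);
        return $"page {id}";
    }

    // Runs 'requests' fake downloads through Merge(maxConcurrent)
    // and returns the peak concurrency that was observed.
    public static async Task<int> RunAsync(int requests, int maxConcurrent)
    {
        _inFlight = 0;
        _peak = 0;

        var pages = await Observable.Range(1, requests)
            .Select(id => Observable.FromAsync(() => FakeDownloadAsync(id)))
            .Merge(maxConcurrent)
            .ToList();

        Console.WriteLine($"downloaded {pages.Count} pages");
        return _peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(10, 2);
        Console.WriteLine($"peak concurrency: {peak}");
    }
}
```

The reported peak should never exceed the value passed to Merge. If the FromAsync wrapper were replaced by calling FakeDownloadAsync(id) directly inside Select, every task would start immediately and the cap would no longer apply, which is the behaviour described in the question.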
Here is an example of how to do this with the TPL Dataflow API:
    // Requires: using System.Threading.Tasks.Dataflow;
    private static Task DoIt()
    {
        var urls = new List<string>
        {
            "http://www.amazon.com",
            "http://www.bing.com",
            "http://www.facebook.com",
            "http://www.twitter.com",
            "http://www.google.com"
        };

        var client = new HttpClient();

        // Create a block that takes a URL as input
        // and produces the download result as output
        TransformBlock<string, string> downloadBlock =
            new TransformBlock<string, string>(
                uri => client.GetStringAsync(new Uri(uri, UriKind.Absolute)),
                new ExecutionDataflowBlockOptions
                {
                    // At most 2 download operations execute at the same time
                    MaxDegreeOfParallelism = 2
                });

        // Create a block that prints out the result
        ActionBlock<string> doneBlock =
            new ActionBlock<string>(x => Console.WriteLine(x));

        // Link the output of the first block to the input of the second one
        downloadBlock.LinkTo(
            doneBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        // Feed the URLs into the first block
        foreach (var url in urls)
        {
            downloadBlock.Post(url);
        }

        downloadBlock.Complete(); // Mark completion of input

        // Lets the caller wait for the whole operation to complete
        return doneBlock.Completion;
    }

    static void Main(string[] args)
    {
        DoIt().Wait();
        Console.WriteLine("Done");
        Console.ReadLine();
    }
Can you see whether this helps?
    // Requires: using System.Reactive.Threading.Tasks; (for Task.ToObservable)
    var urls = new List<string>
    {
        "http://www.amazon.com",
        "http://www.bing.com",
        "http://www.google.com",
        "http://www.twitter.com",
        "http://www.google.com"
    };

    var contents =
        urls
            .ToObservable()
            .SelectMany(uri =>
                Observable
                    .Using(
                        // One HttpClient per request, disposed when the request ends
                        () => new System.Net.Http.HttpClient(),
                        client =>
                            client
                                .GetStringAsync(new Uri(uri, UriKind.Absolute))
                                .ToObservable()));