HttpClient concurrent behavior different when running in PowerShell than in Visual Studio

I'm migrating millions of users from on-prem AD to Azure AD B2C, using the MS Graph API to create the users in B2C. I've written a .NET Core 3.1 console application to perform this migration. To speed things along I'm making concurrent calls to the Graph API. This is working great, sort of.

During development I experienced acceptable performance while running from Visual Studio 2019, but for testing I'm running from the command line in PowerShell 7. From PowerShell the performance of concurrent calls to the HttpClient is very bad. It appears that there's a limit to the number of concurrent calls that HttpClient allows when running from PowerShell, so calls in concurrent batches greater than 40 to 50 requests start to stack up. It seems to be running 40 to 50 concurrent requests while blocking the rest.

I'm not looking for assistance with async programming. I'm looking for a way to troubleshoot the difference between the Visual Studio run-time behavior and the PowerShell command-line run-time behavior. Running in release mode from Visual Studio's green arrow button behaves as expected. Running from the command line does not.

I fill a task list with async calls and then await Task.WhenAll(tasks). Each call takes between 300 and 400 milliseconds. When running from Visual Studio it works as expected. I make concurrent batches of 1000 calls and each individually completes within the expected time. The whole task block takes just a few milliseconds longer than the longest individual call.

The behavior changes when I run the same build from the PowerShell command line. The first 40 to 50 calls take the expected 300 to 400 milliseconds, but then the individual call times grow to as much as 20 seconds each. I think the calls are serializing, so only 40 to 50 are being executed at a time while the others wait.

After hours of trial and error I was able to narrow it down to the HttpClient. To isolate the problem I mocked the calls to HttpClient.SendAsync with a method that does Task.Delay(300) and returns a mock result. In this case running from the console behaves identically to running from Visual Studio.

I'm using IHttpClientFactory and I've even tried adjusting the connection limit on ServicePointManager.

Here's my registration code.

    public static IServiceCollection RegisterHttpClient(this IServiceCollection services, int batchSize)
    {
        ServicePointManager.DefaultConnectionLimit = batchSize;
        ServicePointManager.MaxServicePoints = batchSize;
        ServicePointManager.SetTcpKeepAlive(true, 1000, 5000);

        services.AddHttpClient(MSGraphRequestManager.HttpClientName, c =>
        {
            c.Timeout = TimeSpan.FromSeconds(360);
            c.DefaultRequestHeaders.Add("User-Agent", "xxxxxxxxxxxx");
        })
        .ConfigurePrimaryHttpMessageHandler(() => new DefaultHttpClientHandler(batchSize));

        return services;
    }

Here's the DefaultHttpClientHandler.

    internal class DefaultHttpClientHandler : HttpClientHandler
    {
        public DefaultHttpClientHandler(int maxConnections)
        {
            this.MaxConnectionsPerServer = maxConnections;
            this.UseProxy = false;
            this.AutomaticDecompression = System.Net.DecompressionMethods.GZip | System.Net.DecompressionMethods.Deflate;
        }
    }
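
For comparison, here is a minimal sketch of the same limits set directly on SocketsHttpHandler, which is the handler HttpClient uses by default on .NET Core 3.1. On .NET Core, ServicePointManager settings do not apply to HttpClient, so the per-server connection limit has to be set on the handler itself. The `HandlerFactory` name and the value 500 are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class HandlerFactory
{
    // Hypothetical helper: builds a handler with the same settings as DefaultHttpClientHandler.
    public static SocketsHttpHandler Create(int maxConnections) => new SocketsHttpHandler
    {
        // Upper bound on simultaneous connections to a single host.
        MaxConnectionsPerServer = maxConnections,
        UseProxy = false,
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
    };

    public static void Main()
    {
        using var handler = Create(500);
        using var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(360) };
        Console.WriteLine($"MaxConnectionsPerServer = {handler.MaxConnectionsPerServer}");
    }
}
```

Printing the effective value at startup in both environments is one way to confirm that the limit actually reaches the handler in each case.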

Here's the code that sets up the tasks.

        var timer = Stopwatch.StartNew();
        var tasks = new Task<(UpsertUserResult, TimeSpan)>[users.Length];
        for (var i = 0; i < users.Length; ++i)
        {
            tasks[i] = this.CreateUserAsync(users[i]);
        }

        var results = await Task.WhenAll(tasks);
        timer.Stop();
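
The tuple return type above suggests that each call reports its own elapsed time. The original CreateUserAsync is not shown, so the following is only a sketch of how such per-call timing could be done; `TimeAsync` and `FakeCallAsync` are hypothetical names, and the 300 ms delay stands in for the Graph request.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TimedCalls
{
    // Runs one async operation and returns its result together with its own elapsed time.
    public static async Task<(T Result, TimeSpan Elapsed)> TimeAsync<T>(Func<Task<T>> operation)
    {
        var timer = Stopwatch.StartNew();
        var result = await operation().ConfigureAwait(false);
        timer.Stop();
        return (result, timer.Elapsed);
    }

    public static async Task Main()
    {
        // Hypothetical stand-in for the Graph call: a 300 ms delay.
        async Task<int> FakeCallAsync(int id)
        {
            await Task.Delay(300).ConfigureAwait(false);
            return id;
        }

        var batchTimer = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, 1000)
            .Select(i => TimeAsync(() => FakeCallAsync(i)))
            .ToArray();
        var results = await Task.WhenAll(tasks);
        batchTimer.Stop();

        // If calls queue up behind a limit, the slowest call drifts far above the fastest.
        Console.WriteLine($"Batch: {batchTimer.ElapsedMilliseconds} ms");
        Console.WriteLine($"Slowest call: {results.Max(r => r.Elapsed).TotalMilliseconds:F0} ms");
        Console.WriteLine($"Fastest call: {results.Min(r => r.Elapsed).TotalMilliseconds:F0} ms");
    }
}
```

Comparing the slowest and fastest call times against the batch time in both environments makes the serialization visible without relying on external metrics.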

Here's how I mocked out the HttpClient.

        var httpClient = this.httpClientFactory.CreateClient(HttpClientName);
        #if use_http
            using var response = await httpClient.SendAsync(request);
        #else
            await Task.Delay(300);
            var graphUser = new User { Id = "mockid" };
            using var response = new HttpResponseMessage(HttpStatusCode.OK) { Content = new StringContent(JsonConvert.SerializeObject(graphUser)) };
        #endif
        var responseContent = await response.Content.ReadAsStringAsync();
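
An alternative to the conditional compilation above is a stub message handler, which keeps HttpClient itself in the call path while still avoiding the network. This is a sketch under assumptions: `DelayStubHandler` and `StubDemo` are hypothetical names, and the JSON body is a placeholder.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub: answers every request after a fixed delay without touching the network.
public sealed class DelayStubHandler : HttpMessageHandler
{
    private readonly TimeSpan delay;
    private readonly string body;

    public DelayStubHandler(TimeSpan delay, string body)
    {
        this.delay = delay;
        this.body = body;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Simulate the latency of a real call.
        await Task.Delay(this.delay, cancellationToken).ConfigureAwait(false);
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            RequestMessage = request,
            Content = new StringContent(this.body),
        };
    }
}

public static class StubDemo
{
    public static async Task Main()
    {
        var handler = new DelayStubHandler(TimeSpan.FromMilliseconds(300), "{\"id\":\"mockid\"}");
        using var client = new HttpClient(handler);
        using var response = await client.GetAsync("https://example.invalid/users");
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```

With IHttpClientFactory, a handler like this could be supplied through ConfigurePrimaryHttpMessageHandler in place of DefaultHttpClientHandler. If the slowdown disappears with the stub, the limit sits in the real handler's connection handling rather than in HttpClient's own pipeline.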

Here are metrics for 10k B2C users created via the Graph API using 500 concurrent requests. The first 500 requests take longer than normal because the TCP connections are being created.

Here's a link to the console run metrics.

Here's a link to the Visual Studio run metrics.

The block times in the VS run metrics are different from what I said in this post because I moved all the synchronous file access to the end of the process, in an effort to isolate the problematic code as much as possible for the test runs.

The project is compiled using .NET Core 3.1. I'm using Visual Studio 2019 16.4.5.

Two things come to mind. Most Microsoft PowerShell code was written for versions 1 and 2, which have a System.Threading.Thread.ApartmentState of MTA. In versions 3 through 5 the apartment state changed to STA by default.

The second thought is that it sounds like they are using System.Threading.ThreadPool to manage the threads. How big is your thread pool?
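
One way to answer that question from inside the console application is to print the pool limits at startup and compare the output of the two runs. A minimal sketch; the `ThreadPoolInfo` class name is made up for illustration.

```csharp
using System;
using System.Threading;

public static class ThreadPoolInfo
{
    // Returns a one-line summary of the current thread pool configuration.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out var minWorker, out var minIo);
        ThreadPool.GetMaxThreads(out var maxWorker, out var maxIo);
        ThreadPool.GetAvailableThreads(out var freeWorker, out var freeIo);

        // ThreadPool.ThreadCount is available from .NET Core 3.0 onward.
        return $"processors={Environment.ProcessorCount} " +
               $"min(worker={minWorker}, io={minIo}) " +
               $"max(worker={maxWorker}, io={maxIo}) " +
               $"available(worker={freeWorker}, io={freeIo}) " +
               $"threads={ThreadPool.ThreadCount}";
    }

    public static void Main() => Console.WriteLine(Describe());
}
```

Calling `Describe()` again while a batch is in flight shows whether the number of pool threads is climbing, which would point toward blocking work on pool threads.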

If those do not solve the issue, start digging under System.Threading.

When I read your question I thought of this blog post: https://devblogs.microsoft.com/oldnewthing/20170623-00/?p=96455

A colleague demonstrated with a sample program that creates a thousand work items, each of which simulates a network call that takes 500ms to complete. In the first demonstration, the network calls were blocking synchronous calls, and the sample program limited the thread pool to ten threads in order to make the effect more apparent. Under this configuration, the first few work items were quickly dispatched to threads, but then the latency started to build as there were no more threads available to service new work items, so the remaining work items had to wait longer and longer for a thread to become available to service it. The average latency to the start of the work item was over two minutes.
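
The effect described in that quote can be reproduced on a small scale. The sketch below is not the blog's program: it uses 200 items of 50 ms each, leaves the thread pool at its default size, and the `StarvationDemo` name is hypothetical. It measures how long each work item waits before it starts running.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StarvationDemo
{
    // Queues `count` work items and returns the average delay (ms) before each one started.
    public static async Task<double> AverageStartLatencyMs(int count, Func<Task> work)
    {
        var tasks = Enumerable.Range(0, count).Select(_ =>
        {
            var queuedAt = Stopwatch.GetTimestamp();
            return Task.Run(async () =>
            {
                // Time spent waiting in the queue for a pool thread.
                var waited = (Stopwatch.GetTimestamp() - queuedAt) * 1000.0 / Stopwatch.Frequency;
                await work().ConfigureAwait(false);
                return waited;
            });
        }).ToArray();

        var latencies = await Task.WhenAll(tasks).ConfigureAwait(false);
        return latencies.Average();
    }

    public static async Task Main()
    {
        // Blocking: each item holds a pool thread for the full 50 ms.
        var blocking = await AverageStartLatencyMs(200, () =>
        {
            Thread.Sleep(50);
            return Task.CompletedTask;
        });

        // Non-blocking: the thread is released while the delay is pending.
        var nonBlocking = await AverageStartLatencyMs(200, () => Task.Delay(50));

        Console.WriteLine($"Blocking work, average start latency: {blocking:F1} ms");
        Console.WriteLine($"Async work, average start latency:    {nonBlocking:F1} ms");
    }
}
```

The exact numbers depend on the machine, but the blocking variant typically shows the growing start latency the quote describes, which resembles the pattern in the question where the first calls are fast and later ones take much longer.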

Update 1: I ran PowerShell 7.0 from the Start menu and the thread's apartment state was STA. Is the apartment state different in the two versions?

PS C:\Program Files\PowerShell\7>  [System.Threading.Thread]::CurrentThread

ManagedThreadId    : 12
IsAlive            : True
IsBackground       : False
IsThreadPoolThread : False
Priority           : Normal
ThreadState        : Running
CurrentCulture     : en-US
CurrentUICulture   : en-US
ExecutionContext   : System.Threading.ExecutionContext
Name               : Pipeline Execution Thread
ApartmentState     : STA

Update 2: I wish I had a better answer, but you will have to compare the two environments until something stands out.

PS C:\Windows\system32> [System.Net.ServicePointManager].GetProperties() | select name

Name                               
----                               
SecurityProtocol                   
MaxServicePoints                   
DefaultConnectionLimit             
MaxServicePointIdleTime            
UseNagleAlgorithm                  
Expect100Continue                  
EnableDnsRoundRobin                
DnsRefreshTimeout                  
CertificatePolicy                  
ServerCertificateValidationCallback
ReusePort                          
CheckCertificateRevocationList     
EncryptionPolicy            

Update 3:

https://docs.microsoft.com/en-us/uwp/api/windows.web.http.httpclient

In addition, every HttpClient instance uses its own connection pool, isolating its requests from requests executed by other HttpClient instances.

If an app using HttpClient and related classes in the Windows.Web.Http namespace downloads large amounts of data (50 megabytes or more), then the app should stream those downloads and not use the default buffering. If the default buffering is used the client memory usage will get very large, potentially resulting in reduced performance.

Just keep comparing the two environments and the issue should stand out.

Add-Type -AssemblyName System.Net.Http
$client = New-Object -TypeName System.Net.Http.Httpclient
$client | format-list *

DefaultRequestHeaders        : {}
BaseAddress                  : 
Timeout                      : 00:01:40
MaxResponseContentBufferSize : 2147483647
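
The same comparison can be made from inside the console application, so that the Visual Studio run and the PowerShell run each report their own process state. The following is a sketch only: the `EnvironmentReport` name and the list of environment-variable prefixes are assumptions about what might be worth comparing, not a known cause of the problem.

```csharp
using System;
using System.Collections;
using System.Diagnostics;
using System.Linq;
using System.Runtime;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading;

public static class EnvironmentReport
{
    // Assumed-interesting prefixes: runtime switches and proxy settings.
    private static readonly string[] Prefixes =
        { "DOTNET_", "COMPlus_", "ASPNETCORE_", "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY", "ALL_PROXY" };

    public static string Build()
    {
        ThreadPool.GetMinThreads(out var minWorker, out var minIo);
        ThreadPool.GetMaxThreads(out var maxWorker, out var maxIo);

        var sb = new StringBuilder();
        sb.AppendLine($"Framework        : {RuntimeInformation.FrameworkDescription}");
        sb.AppendLine($"ProcessorCount   : {Environment.ProcessorCount}");
        sb.AppendLine($"Is64BitProcess   : {Environment.Is64BitProcess}");
        sb.AppendLine($"ServerGC         : {GCSettings.IsServerGC}");
        sb.AppendLine($"DebuggerAttached : {Debugger.IsAttached}");
        sb.AppendLine($"CurrentDirectory : {Environment.CurrentDirectory}");
        sb.AppendLine($"CommandLine      : {Environment.CommandLine}");
        sb.AppendLine($"ThreadPool min   : worker={minWorker}, io={minIo}");
        sb.AppendLine($"ThreadPool max   : worker={maxWorker}, io={maxIo}");

        // List only the environment variables that match one of the prefixes above.
        var variables = Environment.GetEnvironmentVariables()
            .Cast<DictionaryEntry>()
            .Select(e => (Key: (string)e.Key, Value: (string)e.Value))
            .Where(v => Prefixes.Any(p => v.Key.StartsWith(p, StringComparison.OrdinalIgnoreCase)))
            .OrderBy(v => v.Key, StringComparer.OrdinalIgnoreCase);

        foreach (var (key, value) in variables)
        {
            sb.AppendLine($"env {key} = {value}");
        }

        return sb.ToString();
    }

    public static void Main() => Console.Write(Build());
}
```

Writing this report to a file at the start of each run and diffing the two files gives a concrete list of differences to investigate.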
