Maximising concurrent request handling with PostgreSQL / Npgsql client

I have a db and a client app that does reads and writes. I need to handle a lot of concurrent reads while making sure writes get priority, all while respecting my db's connection limit.

Long version:
I have a single-instance PostgreSQL database which allows 100 connections. My .NET microservice uses Npgsql to connect to the db. It has to do read queries that can take 20-2000ms and writes that can take about 500-2000ms. Right now there are 2 instances of the app, connecting with the same user credentials. I am trusting Npgsql to manage my connection pooling, and am preparing my read queries, as there are basically just 2 or 3 variants with different parameter values.
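For context, the prepared reads look something like this (a simplified sketch; the query text, table and parameter names are placeholders, and Npgsql's Max Auto Prepare connection string option would be an alternative to explicit preparation):

await using var conn = new NpgsqlConnection(connString);
await conn.OpenAsync(ct);
await using var cmd = new NpgsqlCommand("SELECT id, payload FROM some_data WHERE category = @cat", conn);
cmd.Parameters.AddWithValue("cat", category);
await cmd.PrepareAsync(ct);   // the prepared statement persists on the pooled physical connection
await using var reader = await cmd.ExecuteReaderAsync(ct);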

As user requests increased, I started hitting the database's connection limit, with errors like 'Too many connections' coming back from the db.

To deal with this I introduced a simple gate system in my repository class:

private static readonly SemaphoreSlim _writeGate = new(20, 20);
private static readonly SemaphoreSlim _readGate = new(25, 25);

public async Task<IEnumerable<SomeDataItem>> ReadData(string query, CancellationToken ct)
{
   await _readGate.WaitAsync(ct);
   try
   {
      // run the read query and return the results
   }
   finally
   {
      _readGate.Release();
   }
}

public async Task WriteData(IEnumerable<SomeDataItem> items, CancellationToken ct)
{
   await _writeGate.WaitAsync(ct);
   try
   {
      // execute the write
   }
   finally
   {
      _writeGate.Release();
   }
}

I chose to have separate gates for read and write because I wanted to be confident that reads would not get completely blocked by concurrent writes. The limits are hardcoded as above: a total of 45 on each of the 2 app instances, both connecting to 1 db server instance. It is more important that attempts to write data do not fail than attempts to read. I have some further safety here with a Polly retry pattern.
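That retry wrapper is roughly like this (a minimal sketch in Polly v7 syntax; the retry count and backoff are illustrative, not my actual values):

private static readonly IAsyncPolicy _retryPolicy = Policy
   .Handle<NpgsqlException>(ex => ex.IsTransient)
   .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(200 * attempt));

// writes run through the policy so transient failures are retried rather than surfaced
await _retryPolicy.ExecuteAsync(token => WriteData(items, token), ct);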

This was alright for a while, but as the concurrent read requests increase, I see that response times start to degrade as a backlog of read requests accumulates.

So, for this question, assume my SQL queries and db schema are optimized to the max: what can I do to improve my throughput?

I know that there are times when my _readGate is maxed out while there is free capacity in the _writeGate. However, I don't dare reduce the hardcoded limits, because at other times I need to support concurrent writes. So I need some kind of QoS solution that can allow more concurrent reads when possible, but will give priority to writes when needed.
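To illustrate the kind of thing I mean (a hypothetical sketch, not code I have): one shared pool sized to the whole per-instance budget, with a slice reserved that only writes can use, so reads can expand when writes are idle but can never starve them:

private const int TotalPermits = 45;        // whole per-instance connection budget
private const int ReservedForWrites = 10;   // slice that reads can never consume

private static readonly SemaphoreSlim _pool = new(TotalPermits, TotalPermits);
private static readonly SemaphoreSlim _readCap = new(TotalPermits - ReservedForWrites, TotalPermits - ReservedForWrites);

public async Task<T> RunRead<T>(Func<CancellationToken, Task<T>> readOp, CancellationToken ct)
{
   await _readCap.WaitAsync(ct);   // cap reads at 35, so 10 permits always remain for writes
   try
   {
      await _pool.WaitAsync(ct);
      try { return await readOp(ct); }
      finally { _pool.Release(); }
   }
   finally
   {
      _readCap.Release();
   }
}

public async Task RunWrite(Func<CancellationToken, Task> writeOp, CancellationToken ct)
{
   await _pool.WaitAsync(ct);      // writes draw from the full pool, including the reserved slice
   try { await writeOp(ct); }
   finally { _pool.Release(); }
}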

Queue management is pretty complicated to me but is also quite well known to many, so is there a good NuGet package that can help me out? (I'm not even sure what to google.)
Is there a simple change to my code to improve on what I have above?
Would it help to have different connection strings / users for reads vs writes?
Anything else I can do with Npgsql / the connection string that can improve things?

I think PostgreSQL recommends limiting connections to 100; there's an SO thread on this here: How to increase the max connections in postgres? There's always a limit to how many simultaneous queries you can run before perf stops improving and eventually drops off. However, I can see in my Azure telemetry that my db server is not coming close to fully using CPU, RAM or disk IO (CPU doesn't exceed 70% and is often less, memory the same, and IOPS under 30% of capacity), so I believe there is more to be squeezed out somewhere :)

Maybe there are other places to investigate, but for the sake of this question I'd just like to focus on how to better manage connections.

First, if you're getting "Too many connections" on the PostgreSQL side, that means that the total number of physical connections being opened by Npgsql exceeds the max_connections setting in PG. You need to make sure that the aggregate total of Npgsql's Max Pool Size across all app instances doesn't exceed that, so if your max_connections is 100 and you have two app instances, each needs to run with Max Pool Size=50.
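For example (a sketch; the host, database and user names are placeholders), each instance's connection string would cap its pool like this:

// each of the 2 app instances stays at or under 50 pooled connections,
// so the aggregate stays under max_connections=100 on the server
var connString = "Host=pg-server;Database=mydb;Username=app;Max Pool Size=50";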

Second, you can indeed have different connection pools for reads vs. writes by using different connection strings (a good trick for that is to set the Application Name to different values). However, you may want to set up one or more read replicas (a primary/secondary setup); this would allow all read workload to be directed to the read replica(s), while keeping the primary for write operations only. This is a good load balancing technique, and Npgsql 6.0 has introduced great support for it ( https://www.npgsql.org/doc/failover-and-load-balancing.html ).
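A sketch of what that might look like (host names are placeholders; the multi-host keywords are from the Npgsql 6 failover/load-balancing docs linked above):

// writes: pin to the primary
var writeConnString = "Host=pg-primary,pg-replica1;Database=mydb;Username=app;Application Name=svc-writes;Target Session Attributes=primary";

// reads: prefer standbys and spread the load across them
var readConnString = "Host=pg-primary,pg-replica1;Database=mydb;Username=app;Application Name=svc-reads;Target Session Attributes=prefer-standby;Load Balance Hosts=true";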

Apart from that, you can definitely experiment with increasing max_connections on the PG side - and accordingly Max Pool Size on the clients' side - and load-test what this does to resource utilization.
