
How to keep an API idempotent while receiving multiple requests with the same id at the same time?

Original post: 2015-07-25 01:36:24. Tags: api, rest, http, idempotent

From the many articles and commercial APIs I have seen, most people make their APIs idempotent by asking the client to provide a requestId or idempotency key (e.g. https://www.masteringmodernpayments.com/blog/idempotent-stripe-requests ) and storing a requestId <-> response map in their storage. If a request comes in whose requestId is already in this map, the application just returns the stored response.
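To make the setup concrete, here is a minimal sketch of that requestId <-> response map. It is only an illustration: `handle`, `charge`, and the in-memory dict are hypothetical names, and a real service would use shared storage rather than a process-local dict.

```python
from typing import Callable, Dict

# Hypothetical in-memory stand-in for the "requestId <-> response" storage.
_responses: Dict[str, dict] = {}


def handle(request_id: str, do_work: Callable[[], dict]) -> dict:
    """Replay the stored response for a known request_id, otherwise run the work once."""
    if request_id in _responses:
        return _responses[request_id]  # replay: no side effects the second time
    response = do_work()
    _responses[request_id] = response
    return response


calls = []


def charge() -> dict:
    # Stand-in for a side effect that must not happen twice.
    calls.append(1)
    return {"status": 201, "charge": len(calls)}


first = handle("req-1", charge)
second = handle("req-1", charge)  # same id: returns the stored response
```

Note that this naive version has exactly the gap asked about: if a second call arrives before the first has stored its response, both calls miss the map and both run the work.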

This all makes sense to me, but my problem is: how do I handle the case where a second call comes in while the first call is still in progress?

So here are my questions:

1. I guess the ideal behaviour would be for the second call to wait until the first call finishes and then return the first call's response? Is this how people do it?
2. If yes, how long should the second call wait for the first call to finish?
3. If the second call has a wait time limit and the first call still hasn't finished, what should it tell the client? Should it just not return any response, so that the client times out and retries?

2 Answers

For Wunderlist we use database constraints to make sure that no request id (which is a column in every one of our tables) is ever used twice. Since our database (Postgres) guarantees that two records violating this constraint can never both be inserted, we only need to react properly to the potential insertion error. Basically, we outsource this detail to our datastore.
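A rough sketch of the constraint approach follows. To keep it self-contained it uses Python's built-in `sqlite3` instead of Postgres, and the `tasks` table, `create_task` function, and the choice to answer a duplicate with 200 plus the existing row are all assumptions for illustration, not Wunderlist's actual schema or behaviour.

```python
import sqlite3
from typing import Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    " id INTEGER PRIMARY KEY,"
    " request_id TEXT NOT NULL UNIQUE,"  # the constraint that does the coordination
    " title TEXT NOT NULL)"
)


def create_task(request_id: str, title: str) -> Tuple[int, dict]:
    """Insert once per request_id; a duplicate insert is caught and answered from the stored row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tasks (request_id, title) VALUES (?, ?)",
                (request_id, title),
            )
        status = 201
    except sqlite3.IntegrityError:
        # The unique constraint rejected the second insert: another request
        # with this id already created the record.
        status = 200
    row = conn.execute(
        "SELECT id, title FROM tasks WHERE request_id = ?", (request_id,)
    ).fetchone()
    return status, {"id": row[0], "title": row[1]}


created = create_task("req-42", "buy milk")
replayed = create_task("req-42", "buy milk")
```

The application never asks "is another request in flight?"; it only handles the insertion error the database raises.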

No matter how you go about this, I would recommend trying not to coordinate in your application. If you try to detect whether two things are happening at once, there is a high likelihood of bugs. Instead, there might be a system you already use that can make the guarantees you need.

Now, to specifically address your three questions:

1. For us, since we use database constraints, the database handles making things queue up and wait. This is why I personally prefer the old SQL databases - not for the SQL or relations, but because they are really good at locking and queuing. We use SQL databases as dumb disconnected tables.
2. This depends a lot on your system. We try to tune all of our timeouts to around 1s in each system and subsystem. We'd rather fail fast than queue up. If you don't know the right value ahead of time, you can measure your timings, look at the 99th percentile, and set that as your timeout.
3. We would return a 504 HTTP status (and an appropriate response body) to the client. The reason for having an idempotency key is so that the client can retry a request - so we are never worried about timing out and letting them do just that. Again, we'd rather time out fast and fix the problems than let things queue up. If things queue up, then even after something is fixed one has to wait a while for things to get better.
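As an illustration of points 2 and 3, the sketch below derives a timeout from the 99th percentile of measured timings and answers 504 when the work does not finish in time. The function names and the response body are hypothetical, and the thread pool is just an application-level stand-in for whatever actually enforces the timeout in a given system.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def p99(samples_seconds):
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return statistics.quantiles(samples_seconds, n=100)[-1]


def call_with_timeout(work, timeout_seconds):
    """Run work() and give up with a 504 if it takes longer than timeout_seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(work)
    try:
        return 200, future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # Fail fast; the client is expected to retry with the same key.
        return 504, {"error": "timed out, retry with the same idempotency key"}
    finally:
        pool.shutdown(wait=False)  # do not block the response on the slow work


measured = [0.01 * i for i in range(1, 101)]  # made-up latency samples, in seconds
timeout = p99(measured)

fast = call_with_timeout(lambda: "ok", timeout)
slow = call_with_timeout(lambda: time.sleep(0.3), 0.05)
```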

It's a bit hard to tell whether the second call is from the same client with the same request token, or from a different client.

Normally, in the case of concurrent requests from different clients operating on the same resource, you would also want to implement a versioning strategy alongside the request token used for idempotency.

A typical versioning strategy in a relational database might be a version column with a trigger that auto-increments the number each time a record is updated.
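A small sketch of such a version column and trigger, using SQLite through Python's built-in `sqlite3` so it runs as-is. The `resources` table and `bump_version` trigger are hypothetical names, and trigger syntax differs in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE resources (
        id      INTEGER PRIMARY KEY,
        body    TEXT NOT NULL,
        version INTEGER NOT NULL DEFAULT 1
    );

    -- Fires only when body is updated, so the trigger's own UPDATE of
    -- version does not fire it again.
    CREATE TRIGGER bump_version
    AFTER UPDATE OF body ON resources
    FOR EACH ROW
    BEGIN
        UPDATE resources SET version = OLD.version + 1 WHERE id = OLD.id;
    END;
    """
)

conn.execute("INSERT INTO resources (id, body) VALUES (1, 'draft')")
conn.execute("UPDATE resources SET body = 'final' WHERE id = 1")
version = conn.execute("SELECT version FROM resources WHERE id = 1").fetchone()[0]
```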

With this in place, all clients must specify their request token as well as the version they are updating (typically the If-Match header is used for this, with the version number used as the value of the ETag).

On the server side, when it comes time to update the state of the resource, you first check that the version number in the database matches the version supplied in the ETag. If they match, you write the changes and the version increments. Assuming the second request was operating on the same version number as the first, it would then fail with a 412 (or 409, depending on how you interpret the HTTP specifications) and the client should not retry.
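One way to sketch that server-side check is a single conditional UPDATE, so the comparison and the write cannot be separated by another request. This is an assumption-laden illustration: `put_resource` and the table are hypothetical, the version is bumped inside the UPDATE itself rather than by a trigger, and the ETag is assumed to be the quoted version number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources ("
    " id INTEGER PRIMARY KEY, body TEXT NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO resources VALUES (1, 'draft', 1)")


def put_resource(resource_id: int, if_match: str, body: str) -> int:
    """Apply the update only if the client's version is still current; return an HTTP status."""
    expected = int(if_match.strip('"'))  # assumes the ETag is the quoted version number
    with conn:
        cur = conn.execute(
            "UPDATE resources SET body = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (body, resource_id, expected),
        )
    # rowcount 0 means no row matched: the version moved on (or the id is unknown).
    return 200 if cur.rowcount == 1 else 412


first = put_resource(1, '"1"', "edit from client A")
second = put_resource(1, '"1"', "edit from client B")  # stale version
```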

If you really want to stop the second request immediately while the first request is in progress, you are going down the route of pessimistic locking, which doesn't suit REST APIs that well.

If you are actually talking about the client retrying with the same request token because it received a transient network error, it's almost the same case.

Both requests will be running at the same time. The second request will start because the first has not finished and has not yet recorded the request token in the database, but whichever one finishes first will succeed and record the request token.

The other request will receive a version conflict (since the first request has incremented the version), at which point it should recheck the request token table, find its own token in there, assume that a concurrent request with the same token finished before it did, and return 200.
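The retry case can be sketched as follows. All names here (`request_tokens`, `apply_update`, `token_recorded`) are hypothetical, and `apply_update` represents only the part that runs after both requests have already passed the initial token lookup, which is why calling it twice in a row simulates the race.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE resources (id INTEGER PRIMARY KEY, body TEXT NOT NULL, version INTEGER NOT NULL);
    CREATE TABLE request_tokens (token TEXT PRIMARY KEY);
    INSERT INTO resources VALUES (1, 'draft', 1);
    """
)


def token_recorded(token: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM request_tokens WHERE token = ?", (token,)
    ).fetchone()
    return row is not None


def apply_update(token: str, expected_version: int, body: str) -> int:
    """Versioned write; on conflict, treat an already recorded token as success."""
    with conn:  # the update and the token insert commit together
        cur = conn.execute(
            "UPDATE resources SET body = ?, version = version + 1 "
            "WHERE id = 1 AND version = ?",
            (body, expected_version),
        )
        if cur.rowcount == 1:
            conn.execute("INSERT INTO request_tokens (token) VALUES (?)", (token,))
            return 200
    # Version conflict: recheck the token table. If our own token is there,
    # a concurrent request with the same token finished first.
    return 200 if token_recorded(token) else 412


winner = apply_update("tok-A", 1, "final")
retry = apply_update("tok-A", 1, "final")       # same token, lost the race
stranger = apply_update("tok-B", 1, "other")    # different token, stale version
```

Here `winner` and `retry` both get 200 while the resource is written once, and the request with a different token gets the 412 described earlier.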

It seems like a lot, but if you want to cover all the weird and wonderful failure modes when you're dealing with REST, idempotency and concurrency, this is the way to deal with it.