Race condition for delete and insert in postgres

I am working on a node project where I have imported the pg library for database operations. I have a Kafka queue from which I fetch events and store them in the database. I fetch orders from Kafka, and every time an order is updated a new event is generated; I need to delete the old order details and replace them with the new ones.

Below is the code:

async function saveOrders(orders: Array<TransactionOnOrders>) {
  const client = await pool.connect()

  try {
    await client.query('BEGIN')
    // Remove any existing rows for this order before re-inserting
    if (orders.length) {
      const deleted = await deleteOrders(client, orders[0].orderId)
      logger.debug(`deleted rowCount ${deleted.rowCount} ${orders[0].orderId}`)
    }
    // Insert the fresh rows; pg queues queries issued on a single client
    const queries = orders.map(ord => saveTransactionOnOrders(client, ord))
    await Promise.all(queries)
    await client.query('COMMIT')
  } catch (e) {
    await client.query('ROLLBACK')
    throw e
  } finally {
    client.release()
  }
}

The orders are updated very frequently and we are receiving lots of events, creating a race condition that leads to records not being deleted and extra records being inserted. For example: say we receive an event for order123 and its transaction is in progress; before it completes, another event for order123 arrives, so that event's delete query returns 0 rows affected and its insert adds another row, leaving 2 rows where there should be only one.

I have tried changing the isolation level, but that didn't work well and resulted in an error:

await client.query('BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ')
await client.query('BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE')

Is there any mistake I am making here, or is there a better way to handle the above situation?

This may be easier if you were updating rows rather than deleting them and recreating them. In that situation you can rely on row locks preventing concurrent updates.
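For illustration, a minimal sketch of that approach, assuming a hypothetical orders table with an order_id key and a details column (neither appears in the question):

// Hypothetical schema: an UPDATE takes a row-level lock on the matched row,
// so a concurrent transaction updating the same order blocks until this one
// commits or rolls back, instead of racing.
await client.query(
  'UPDATE orders SET details = $2 WHERE order_id = $1',
  [ord.orderId, ord.details],
)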

Use INSERT ... ON CONFLICT to "upsert" the incoming rows. That is atomic and free from race conditions.
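As a rough sketch, assuming a hypothetical transaction_on_orders table with a unique constraint on (order_id, transaction_id), an upsert could look like:

// Insert the row, or update it in place if one with the same key already
// exists. The whole statement is atomic, so concurrent events cannot
// produce duplicate rows. Table and column names are assumptions.
await client.query(
  `INSERT INTO transaction_on_orders (order_id, transaction_id, details)
   VALUES ($1, $2, $3)
   ON CONFLICT (order_id, transaction_id)
   DO UPDATE SET details = EXCLUDED.details`,
  [ord.orderId, ord.transactionId, ord.details],
)

One caveat: an upsert alone never removes rows, so if an updated order can contain fewer line items than before, stale rows would survive; in that case the delete-and-insert approach with retries (below) is still needed.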

As others have suggested, the ideal option here is to use INSERT ... ON CONFLICT to do this atomically. I can't help with that without seeing the contents of deleteOrders and saveTransactionOnOrders.

If that's not an option, you should use SERIALIZABLE as the isolation level. You will then get some serialisation errors, but these are safe to retry. If you used @databases ( https://www.atdatabases.org/docs/pg-guide-transactions ) you could retry just by passing retrySerializationFailures: true:

import {IsolationLevel} from '@databases/pg'

async function saveOrders(orders: Array<TransactionOnOrders>) {
  await pool.tx(async client => {
    if (orders.length) {
      const deleted = await deleteOrders(client, orders[0].orderId)
      logger.debug(`deleted rowCount ${deleted.rowCount} ${orders[0].orderId}`)
    }
    const queries = orders.map(ord => saveTransactionOnOrders(client, ord))
    await Promise.all(queries)
  }, {
    isolationLevel: IsolationLevel.SERIALIZABLE,
    retrySerializationFailures: true,
  })
}

@databases handles starting the transaction, and committing/rolling back when the async callback ends. It also retries on serialisation failures.
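If you are staying on plain pg, a rough sketch of the same retry behaviour, relying on the fact that Postgres reports serialisation failures as SQLSTATE 40001 (the retry count and loop shape here are illustrative):

// Retry the whole transaction when Postgres reports a serialization
// failure (SQLSTATE 40001); any other error is rethrown.
async function saveOrdersWithRetry(orders: Array<TransactionOnOrders>, maxRetries = 5) {
  for (let attempt = 0; ; attempt++) {
    const client = await pool.connect()
    try {
      await client.query('BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE')
      if (orders.length) {
        await deleteOrders(client, orders[0].orderId)
      }
      for (const ord of orders) {
        await saveTransactionOnOrders(client, ord)
      }
      await client.query('COMMIT')
      return
    } catch (e: any) {
      await client.query('ROLLBACK')
      if (e.code !== '40001' || attempt >= maxRetries) throw e
      // serialization_failure: safe to retry the whole transaction
    } finally {
      client.release()
    }
  }
}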

If you are dealing with a very high volume of events, you may encounter performance issues due to the high frequency of serialization failures and therefore retries.

You can use a "lock" in node.js to ensure that only one process at a time updates a given set of orders. https://www.atdatabases.org/docs/lock should make this fairly simple to implement. This will only lock out concurrent transactions on a single node.js process though, so you still need the transaction handling to deal with multiple node.js processes.
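If you would rather not add that package, a hand-rolled per-order mutex is small enough to sketch (this is not the @databases lock API, just an illustration): it chains calls for the same orderId so they run one at a time within the process.

// In-process, per-key mutex: calls for the same orderId are chained so
// they execute one at a time. Other processes are unaffected.
const orderLocks = new Map<string, Promise<void>>()

function withOrderLock<T>(orderId: string, fn: () => Promise<T>): Promise<T> {
  const previous = orderLocks.get(orderId) ?? Promise.resolve()
  // Run fn once the previous task for this order settles, success or failure
  const run = previous.then(fn, fn)
  // Store a never-rejecting tail so one failure doesn't poison the chain
  const settled = run.then(() => undefined, () => undefined)
  orderLocks.set(orderId, settled)
  // Evict the map entry once this chain drains
  settled.then(() => {
    if (orderLocks.get(orderId) === settled) orderLocks.delete(orderId)
  })
  return run
}

// Usage: serialise saves per order within this node.js process
// await withOrderLock(orders[0].orderId, () => saveOrders(orders))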
