
Use database as a queue of tasks

Original post 2018-04-29 15:13:25 3 2 java/ postgresql

In one of our Java applications (backed by a PostgreSQL db), we have a database table that maintains a list of tasks to be executed. Each row has a JSON blob with the details of a task, as well as a scheduled time value.

We have a few Java workers/threads whose job is to search for tasks that are ready for execution (based on their scheduled time), execute them, and delete them from the table. Execution of a task may take a few seconds.

The problem is that more than one worker may grab the same row, causing duplicate execution of a task, which is something we want to avoid.

One approach is to do the select that grabs a row with FOR UPDATE to lock the row, supposedly preventing other workers from grabbing the same row while it is locked.

My concern with this approach is that the row is only locked while the select transaction is being executed in the db (according to this). While the Java code is actually executing the selected row/task, the lock is gone and another worker can grab it again.

Can someone shed some light on whether the above approach is going to work for sure? Thanks!

2 answers

Treat the DB calls as atomic instructions and design lock-free algorithms around your table, using updates to change a boolean column "in-progress" from false to true. It could also just be a state int (0=avail, 1=inprogress, N=resultcode).
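The claim step described above could be sketched in JDBC roughly as follows. This is only a sketch under assumptions: the table name tasks and the columns id and state are hypothetical (they are not given in the question), and the connection is assumed to be in autocommit mode at the default READ COMMITTED isolation level. The idea is that a single UPDATE guarded by state = 0 can succeed for at most one worker, so the affected-row count tells each worker whether it owns the task.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TaskClaimer {
    // Hypothetical schema: tasks(id BIGINT PRIMARY KEY, payload JSON,
    //                            scheduled_at TIMESTAMPTZ, state INT)
    // state: 0 = available, 1 = in progress (as suggested in the answer).
    static final String CLAIM_SQL =
        "UPDATE tasks SET state = 1 WHERE id = ? AND state = 0";

    /**
     * Tries to flip the task from state 0 to state 1.
     * Returns true only for the worker whose UPDATE changed the row;
     * a worker that lost the race sees 0 affected rows and returns false.
     */
    static boolean tryClaim(Connection conn, long taskId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setLong(1, taskId);
            return ps.executeUpdate() == 1;
        }
    }
}
```

A worker would call tryClaim for a candidate task and run it only when the call returns true; after finishing, it would delete the row or store a result code in state.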

Make sure you have a partial index on state 0 (and possibly on state 1, to find tasks still in progress when recovering from crashes), so that the ...where state=0 remains selective and fast (on top of the scheduled time index, of course).
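For illustration, the partial indexes mentioned above might be created like this. The table name tasks, the columns scheduled_at and state, and both index names are assumptions made for the example, and IF NOT EXISTS on CREATE INDEX requires PostgreSQL 9.5 or later.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class TaskIndexes {
    // Covers only available tasks (state = 0), ordered by scheduled time,
    // so the "find due tasks" query does not scan finished rows.
    static final String READY_INDEX_SQL =
        "CREATE INDEX IF NOT EXISTS tasks_ready_idx "
        + "ON tasks (scheduled_at) WHERE state = 0";

    // Covers only in-progress tasks (state = 1), for crash recovery scans.
    static final String IN_PROGRESS_INDEX_SQL =
        "CREATE INDEX IF NOT EXISTS tasks_in_progress_idx "
        + "ON tasks (id) WHERE state = 1";

    /** Creates both partial indexes on the hypothetical tasks table. */
    static void createIndexes(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(READY_INDEX_SQL);
            st.execute(IN_PROGRESS_INDEX_SQL);
        }
    }
}
```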

Hope this helps.

When one thread has successfully locked the row on a given connection, another thread attempting to obtain a lock on the same row on a different connection should fail. You should issue the select-for-update with some kind of no-wait clause to request immediate failure if the row is locked.

Now, this doesn't solve the query vs. lock race, as a failed lock may interrupt a thread's execution. You can solve that by doing the following (in each execution):

1. Select all records with new tasks (regardless of whether they're being processed or not).
2. For each new task returned in [1], run a matching select-for-update, then continue with processing the task if the lock succeeds.
3. If any lock attempt fails, skip the task without failing the entire process.
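The steps above could be sketched in JDBC roughly as follows. In PostgreSQL a row lock taken by SELECT ... FOR UPDATE is held until the transaction commits or rolls back, so the sketch keeps the transaction open while the task runs and commits only after deleting the row. Everything named here is an assumption for illustration: the tasks table, its id, payload (assumed NOT NULL) and scheduled_at columns, and the handler callback. NOWAIT is used as the no-wait clause; a failed lock attempt surfaces as SQLSTATE 55P03 (lock_not_available) and aborts the current transaction, which is why each task gets its own transaction and a rollback on failure.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class LockingWorker {
    static final String FIND_DUE_SQL =
        "SELECT id FROM tasks WHERE scheduled_at <= now() ORDER BY scheduled_at";
    static final String LOCK_SQL =
        "SELECT payload FROM tasks WHERE id = ? FOR UPDATE NOWAIT";
    static final String DELETE_SQL = "DELETE FROM tasks WHERE id = ?";
    // PostgreSQL SQLSTATE raised when NOWAIT cannot take the lock.
    static final String LOCK_NOT_AVAILABLE = "55P03";

    /** Runs one pass over the due tasks; returns how many this worker processed. */
    static int runOnce(Connection conn, Consumer<String> handler) throws SQLException {
        conn.setAutoCommit(false);

        // Step 1: list candidate tasks without locking anything.
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(FIND_DUE_SQL);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                ids.add(rs.getLong(1));
            }
        }
        conn.commit();

        int processed = 0;
        for (long id : ids) {
            try {
                // Step 2: lock the row; the lock lasts until commit/rollback.
                String payload = lockAndRead(conn, id);
                if (payload != null) {
                    handler.accept(payload);
                    deleteTask(conn, id);
                    processed++;
                }
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();
                // Step 3: a locked row belongs to another worker, so skip it.
                if (!LOCK_NOT_AVAILABLE.equals(e.getSQLState())) {
                    throw e;
                }
            } catch (RuntimeException e) {
                conn.rollback(); // handler failed: release the lock, keep the task
                throw e;
            }
        }
        return processed;
    }

    /** Returns the payload, or null if another worker already deleted the row. */
    private static String lockAndRead(Connection conn, long id) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(LOCK_SQL)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    private static void deleteTask(Connection conn, long id) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DELETE_SQL)) {
            ps.setLong(1, id);
            ps.executeUpdate();
        }
    }
}
```

On PostgreSQL 9.5 or later, FOR UPDATE SKIP LOCKED is an alternative to NOWAIT: a locked row is silently left out of the result instead of raising an error, so the rollback-and-skip branch would not be needed for that case.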