
Java Concurrent Queries to Same Table for Email Processing

I have a MySQL table that acts as an email queue, holding every record that still needs to be sent. I am trying to send the emails using multiple threads. Each thread has to query this email queue table to grab a set of records, which are then sent and deleted from the table.

How do you decide which records each thread will grab from the table? From there, how do you manage these concurrent queries? I am using Java Spring Boot with Hibernate.

I would imagine something like the approach below. It partitions the work by the id of each row in the database. It is not a great solution if the ids have huge gaps, since some id ranges would then hold few or no rows while others are full. You can refactor it to use a date column or any other data that helps to batch records.

10 - number of threads

i - index of the thread we are currently iterating over (0 to 9)

10000 - batch size, i.e. each query fetches a batch of up to 10000 emails

counter - start id of the current batch; it determines which thread is responsible for that batch

maximumEmailId - maximum id of an email in the emails table

 1. Create 10 numbered threads: 0, 1, 2, ..., 9
 2. Start every thread
 3. Inside each thread (number i):
 4.   For counter = 0, 10000, 20000, ...
 5.     if counter > maximumEmailId then break
 6.     if (counter / 10000) % 10 == i then
          - SELECT * FROM emails WHERE id BETWEEN (counter) AND (counter + 9999)
            (BETWEEN is inclusive on both ends, so the upper bound is counter + 9999)
          - Send emails
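The steps above can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name `BatchPartitioner` and the method `rangesFor` are hypothetical and not part of Spring or Hibernate, and instead of looping over every counter value and testing `(counter / 10000) % 10 == i`, each worker jumps straight to its own batches, which yields the same ranges. A real worker would run the query for each range and send the emails; here the ranges are only computed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes which id ranges a given worker thread owns.
final class BatchPartitioner {

    /**
     * Returns the inclusive [start, end] id ranges owned by worker threadIndex.
     * Worker i owns batches i, i + threadCount, i + 2 * threadCount, and so on.
     */
    static List<long[]> rangesFor(int threadIndex, int threadCount,
                                  long batchSize, long maximumEmailId) {
        List<long[]> ranges = new ArrayList<>();
        long stride = threadCount * batchSize; // distance between two batches of one worker
        for (long start = threadIndex * batchSize; start <= maximumEmailId; start += stride) {
            // Clamp the last batch so it never runs past the largest id.
            long end = Math.min(start + batchSize - 1, maximumEmailId);
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed example values: worker 1 of 10, batch size 10000, largest id 250000.
        for (long[] r : rangesFor(1, 10, 10_000, 250_000)) {
            System.out.println("SELECT * FROM emails WHERE id BETWEEN " + r[0] + " AND " + r[1]);
        }
    }
}
```

With these example values, worker 1 gets the ranges 10000 to 19999, 110000 to 119999 and 210000 to 219999.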

It will behave like this:它的行为如下:

 iteration 0:
   - thread 0 - counter = 0 - select ... where id between 0 and 9999
   - thread 1 - counter = 10000 - select ... where id between 10000 and 19999
   ...
   - thread 9 - counter = 90000 - select ... where id between 90000 and 99999
 iteration 1:
   - thread 0 - counter = 100000 - select ... where id between 100000 and 109999
   - thread 1 - counter = 110000 - select ... where id between 110000 and 119999
   ...
   - thread 9 - counter = 190000 - select ... where id between 190000 and 199999

Basically, this solution has nothing to do with locking or other concurrency tricks: you just divide the dataset so that no two threads try to read the same batch. Imagine 100 boxes on the ground: thread 0 takes the boxes numbered 0, 10, 20, ..., 90, thread 1 takes boxes 1, 11, 21, ..., 91, and so on.
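To check the "no two threads read the same batch" claim, here is a small simulation, again only a sketch under assumptions. The class `DisjointBatchDemo` is hypothetical, the batch size is shrunk to 10 and the largest id to 999 so that it runs instantly, and adding an id to a shared set stands in for "select the batch, send, delete". No database is involved.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: 10 workers claim ids by batch number, with no locking between them.
final class DisjointBatchDemo {
    static final int THREADS = 10;
    static final long BATCH_SIZE = 10; // small on purpose; the text above uses 10000
    static final long MAX_ID = 999;

    /** Runs all workers and returns how many ids were claimed more than once. */
    static int run(Set<Long> claimed) {
        AtomicInteger duplicates = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int i = 0; i < THREADS; i++) {
            final int threadIndex = i;
            pool.execute(() -> {
                long stride = THREADS * BATCH_SIZE;
                for (long start = threadIndex * BATCH_SIZE; start <= MAX_ID; start += stride) {
                    long end = Math.min(start + BATCH_SIZE - 1, MAX_ID);
                    for (long id = start; id <= end; id++) {
                        // add() returns false if another worker already claimed this id.
                        if (!claimed.add(id)) {
                            duplicates.incrementAndGet();
                        }
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return duplicates.get();
    }

    public static void main(String[] args) {
        Set<Long> claimed = ConcurrentHashMap.newKeySet();
        int duplicates = run(claimed);
        System.out.println("ids claimed: " + claimed.size() + ", duplicates: " + duplicates);
    }
}
```

Every id from 0 to 999 is claimed exactly once, so the program reports 1000 claimed ids and 0 duplicates. The partitioning alone keeps the workers apart; the concurrent set is only there to record the result.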
