Assistance with optimizing the MySQL procedure below

MySQL - innodb_version 5.7.33

I am working on a stored procedure that will be called periodically (say, once a month) to populate a table with a list of strings in one column and static values in the other columns. The table also has

  • an ID column (AUTO_INCREMENT) and
  • a timestamp column (CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP)

The string is a concatenation of fixed characters plus an integer (let's say 10). This integer has to be a non-repeating random value within the range.

CREATE DEFINER=`db`@`%` PROCEDURE `InsertRandom`(IN NumRows INT, IN MinVal INT, IN MaxVal INT)
BEGIN
    DECLARE i INT;
    DECLARE UniqueId VARCHAR(15);  -- holds a string: 'ABC' prefix plus the random integer
    SET i = 1;
    START TRANSACTION;
    WHILE i <= NumRows DO
        SET UniqueId = concat('ABC', MinVal + CEIL(RAND() * (MaxVal - MinVal)));
        IF  NOT EXISTS (SELECT UNIQUE_ID FROM MY_TABLE WHERE UNIQUE_ID = UniqueId) THEN
            INSERT INTO MY_TABLE (`UNIQUE_ID`, `STATE`, `RANGE_ID`) VALUES (UniqueId, 'new', '100');
        END IF;
        SET i = i + 1;
    END WHILE;
    COMMIT;
END

The range (MaxVal - MinVal) will be 1 million for every procedure call.

For example,

CALL InsertRandom(1000000, 10000000,11000000);

The table is purged once every 5 months, retaining 1 month of data, so we can assume there will be about 5 million records when this procedure is executed. Running a SELECT inside the loop is not optimal, so please suggest an alternative approach.

(From Comment:)

The goal is to have a table with unique IDs within a given range. These shouldn't be in sequence. The range is a minimum of 1 million at a time and a maximum of 10 million, chunks of which will be loaded into the server's memory for further processing. I'm interested in options to populate this table efficiently.
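
As a rough estimate of how often the loop in the question hits an already-used value, assume RAND() is uniform and the draws are independent (both are simplifying assumptions). With n draws from N possible values, the expected number of distinct values is:

```latex
\mathbb{E}[\text{distinct}] = N\left(1 - \left(1 - \tfrac{1}{N}\right)^{n}\right)
\approx N\left(1 - e^{-n/N}\right),
\qquad
n = N = 10^{6} \;\Rightarrow\; \mathbb{E}[\text{distinct}] \approx 10^{6}\,(1 - e^{-1}) \approx 632{,}000
```

Under those assumptions, roughly 37% of the iterations would find an existing value and insert nothing, so one call with NumRows = 1,000,000 would leave well under a million new rows.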

Your stored procedure will hit a bottleneck trying to fill the remaining holes. How about pre-generating the numbers starting at MinVal and inserting them in random order? Take a look:

CREATE TABLE numbers(
    `UNIQUE_ID` INT,
    `STATE` VARCHAR(50),
    `RANGE_ID` VARCHAR(50)
);


DELIMITER $$
DROP PROCEDURE IF EXISTS InsertRandom $$
CREATE PROCEDURE InsertRandom (IN NumRows INT, IN MinVal INT)
BEGIN
    DECLARE i INT;
    DECLARE UniqueId INT;
    SET i = 0;
    
    CREATE TEMPORARY TABLE IF NOT EXISTS numbers_tmp(
        UNIQUE_ID int, 
        RAND_VAL double, 
        INDEX(RAND_VAL)
    );
    
    START TRANSACTION;
    
    WHILE i < NumRows DO
        INSERT INTO numbers_tmp(UNIQUE_ID,RAND_VAL) 
        VALUES (i + MinVal ,rand());
        SET i = i + 1;
    END WHILE;
    
    INSERT INTO numbers (`UNIQUE_ID`, `STATE`, `RANGE_ID`) 
    SELECT `UNIQUE_ID`, 'new', '100'
    FROM numbers_tmp
    ORDER BY RAND_VAL; /* <-- THIS  */
        
    COMMIT;
    
    DROP TEMPORARY TABLE numbers_tmp;
END $$
DELIMITER ;


CALL InsertRandom(100, 500);

mysql> select  * from numbers;
+-----------+-------+----------+
| UNIQUE_ID | STATE | RANGE_ID |
+-----------+-------+----------+
|       575 | new   | 100      |
|       523 | new   | 100      |
|       560 | new   | 100      |
|       537 | new   | 100      |
|       526 | new   | 100      |
|       549 | new   | 100      |
|       598 | new   | 100      |
|       552 | new   | 100      |
|       555 | new   | 100      |
|       581 | new   | 100      |
...     

At this point, maybe it is a good idea to just insert the values into your main table and order by some inserted random number. I don't know.
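
One possible reading of that idea, sketched for MySQL 5.7 (which has no recursive CTEs): build the sequence with a cross join instead of a loop and insert it straight into the target table in random order. The `digits` helper table and the `@MinVal` / `@NumRows` variables are assumptions made for this illustration and are not part of the answer above.

```sql
-- Hypothetical helper table holding the digits 0..9.
CREATE TABLE IF NOT EXISTS digits (d TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT IGNORE INTO digits (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Illustrative parameters, matching CALL InsertRandom(100, 500).
SET @MinVal = 500, @NumRows = 100;

-- Six digit positions cover offsets 0..999999; rows are inserted in random order.
INSERT INTO numbers (`UNIQUE_ID`, `STATE`, `RANGE_ID`)
SELECT @MinVal + seq.n, 'new', '100'
FROM (
    SELECT d0.d + 10 * d1.d + 100 * d2.d + 1000 * d3.d
           + 10000 * d4.d + 100000 * d5.d AS n
    FROM digits AS d0
    CROSS JOIN digits AS d1
    CROSS JOIN digits AS d2
    CROSS JOIN digits AS d3
    CROSS JOIN digits AS d4
    CROSS JOIN digits AS d5
) AS seq
WHERE seq.n < @NumRows
ORDER BY RAND();
```

Because every value in the range is generated exactly once, no existence check is needed; the randomness only affects the insertion order.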

  1. Insert a bunch of random numbers into a table.
  2. Remove from that table any numbers that are currently in use.
  3. When you need a number, take the 'next' number from that table, then delete it.
  4. Check each night to see if the table is 'nearly' empty. If so, rerun the above steps.

Slightly different:

Setup:
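
Steps 2 and 4 could look roughly like this. It is only a sketch: it assumes a pool table `Numbers(number)` and a main table `MainTable(number)`, and the threshold of 1000 is an arbitrary placeholder.

```sql
-- Step 2: remove pooled numbers that are already in use in the main table.
DELETE n
  FROM Numbers AS n
  JOIN MainTable AS m ON m.number = n.number;

-- Step 4: nightly check; returns 1 when the pool is 'nearly' empty.
SELECT COUNT(*) < 1000 AS needs_refill
  FROM Numbers;
```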

CREATE TABLE Numbers (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    number INT UNSIGNED NOT NULL,
    PRIMARY KEY(id),
    UNIQUE(number)    -- avoids dups, see IGNORE below
) ENGINE = InnoDB;

Do this nightly, not monthly:

INSERT IGNORE INTO Numbers (number)
    SELECT r.number
        FROM ( SELECT FLOOR(9000000 * RAND() + 1000000)  -- for range of 1M to 10M
                          AS number
                   FROM MainTable   -- any table with lots of rows
                   LIMIT 12345      -- enough to last 3 days, not a month
             ) AS r
        WHERE NOT EXISTS( SELECT 1 FROM MainTable AS m
                WHERE m.number = r.number );  -- skip numbers already in use
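
If the nightly refill should run inside MySQL itself, the event scheduler is one option. The sketch below assumes the scheduler is enabled (`SET GLOBAL event_scheduler = ON;`) and that `MainTable` has a `number` column; the event name `refill_numbers_nightly` is made up for illustration.

```sql
-- Hypothetical nightly event; first run is 24 hours after creation.
CREATE EVENT IF NOT EXISTS refill_numbers_nightly
    ON SCHEDULE EVERY 1 DAY
    STARTS CURRENT_TIMESTAMP + INTERVAL 1 DAY
    DO
      INSERT IGNORE INTO Numbers (number)
          SELECT r.number
              FROM ( SELECT FLOOR(9000000 * RAND() + 1000000) AS number
                         FROM MainTable
                         LIMIT 12345
                   ) AS r
              -- skip candidates that the main table already uses
              WHERE NOT EXISTS( SELECT 1 FROM MainTable AS m
                      WHERE m.number = r.number );
```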

To get a new, unique number:

START TRANSACTION;
-- lock one row of the pool so concurrent sessions cannot take the same number
SELECT number INTO @number FROM Numbers LIMIT 1 FOR UPDATE;
DELETE FROM Numbers WHERE number = @number;
COMMIT;

Then proceed to use @number in MainTable.
