MySQL select 20 random rows from 300 rows fast

My database has 300 rows at the moment and will probably grow to about 5,000 rows over the next few years. I want to know the best way to select 20 rows at random.

In "MySQL select 10 random rows from 600K rows fast" (which in turn refers to http://jan.kneschke.de/projects/mysql/order-by-rand/ ) I found that the following query produces a random selection very quickly:

SELECT name
  FROM random AS r1 JOIN
       (SELECT (RAND() *
                     (SELECT MAX(id)
                        FROM random)) AS id)
        AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 1

So in PHP I tried the following to get 20 rows:

$anfrage    =   "SELECT name
  FROM random AS r1 JOIN
       (SELECT (RAND() *
                     (SELECT MAX(id)
                        FROM random)) AS id)
        AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 20";

 $ergebnis=$db->query($anfrage)
        or die($db->error);
 while($zeile=mysqli_fetch_assoc($ergebnis))print_r($zeile);

But when I run the script, I don't get 20 rows most of the time. Actually, the probability of picking 20 different rows out of 300 is about 48.8%.

Can I change the code above so that it reliably returns 20 rows and is still very quick?

If you read the article you mention in your question, you will find that it offers three solutions:

  • execute the query several times
  • write a stored procedure that executes the query and stores the results in a temporary table
  • use a UNION

All of them are explained in the article.
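As a rough illustration of the UNION option, the single-row query from the question can be repeated as parenthesized branches. This is only a sketch with three branches (it would have to be extended to 20), reusing the `random` table and its `name` and `id` columns from the question; it is not copied from the article.

```sql
-- Sketch: three independent single-row picks combined with UNION.
-- Each branch has its own derived table, so each draws its own RAND() value.
-- UNION (without ALL) removes duplicate names, so fewer than three rows
-- can come back if two branches land on the same name.
(SELECT r1.name
   FROM random AS r1
   JOIN (SELECT RAND() * (SELECT MAX(id) FROM random) AS id) AS r2
  WHERE r1.id >= r2.id
  ORDER BY r1.id ASC
  LIMIT 1)
UNION
(SELECT r1.name
   FROM random AS r1
   JOIN (SELECT RAND() * (SELECT MAX(id) FROM random) AS id) AS r2
  WHERE r1.id >= r2.id
  ORDER BY r1.id ASC
  LIMIT 1)
UNION
(SELECT r1.name
   FROM random AS r1
   JOIN (SELECT RAND() * (SELECT MAX(id) FROM random) AS id) AS r2
  WHERE r1.id >= r2.id
  ORDER BY r1.id ASC
  LIMIT 1);
```

Because duplicates are dropped, this form cannot guarantee a fixed row count on its own; the application would still need to check how many rows came back.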

The "slow" way of getting 20 random names is this:

SELECT name
FROM random 
ORDER BY rand()
LIMIT 20;

On 300 rows, this might perform about as well as the method you are using. Have you tried it? I'm not sure about 5,000 rows, but it is worth trying there as well.

Your method is essentially this (the query is slightly simplified):

SELECT name
FROM random r1 CROSS JOIN
     (SELECT RAND() * MAX(id) as id FROM random) r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 20;

You are assuming that r2 is re-evaluated with a different random value on each iteration. That assumption may not be true.

Another approach is to do this:
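One way to check that assumption on your own server is to select the derived value next to each row. This is a diagnostic sketch only; the `threshold` alias is made up for illustration, and the result may depend on the MySQL version and optimizer settings.

```sql
-- Show the random threshold that each row of r1 is compared against.
-- If `threshold` is identical on every row, r2 was evaluated only once.
SELECT r1.id,
       r2.id AS threshold
FROM random r1 CROSS JOIN
     (SELECT RAND() * MAX(id) AS id FROM random) r2
ORDER BY r1.id ASC
LIMIT 5;
```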

SELECT name
FROM random r1 CROSS JOIN
     (SELECT count(*) as cnt FROM random) const
WHERE rand() <= 20.0 / cnt;

Unfortunately, this returns only an approximate number of rows: about 20 each time. Maybe you really want exactly 20. In that case, do something like doubling the expected number of rows and then using order by / limit:

SELECT name
FROM random r1 CROSS JOIN
     (SELECT count(*) as cnt FROM random) const
WHERE rand() <= 2*20.0 / cnt
ORDER BY rand()
LIMIT 20;
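To see why doubling the threshold helps, here is a back-of-the-envelope estimate. It assumes cnt = 300 and that rand() is drawn independently for each row, so the number of rows X that pass the WHERE filter is binomial:

```latex
X \sim \mathrm{Binomial}\!\left(300,\ \tfrac{40}{300}\right), \qquad
\mathbb{E}[X] = 300 \cdot \tfrac{40}{300} = 40, \qquad
\sigma = \sqrt{300 \cdot \tfrac{40}{300} \cdot \tfrac{260}{300}} \approx 5.9
```

Under those assumptions, 20 rows lies more than three standard deviations below the mean, so the filter should leave fewer than 20 rows only rarely. It can still happen, so an application that needs exactly 20 should check the row count.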

You could create a shuffled table that you update occasionally:

INSERT INTO random_ids 
SELECT id 
FROM table_name
ORDER BY RAND();
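The answer does not show how `random_ids` is defined. A minimal sketch of one possible layout follows; the `pos` column is an assumption added here to record the shuffle order, and `id` is assumed to be an INT matching `table_name.id`.

```sql
-- Hypothetical layout: `pos` is not part of the original answer.
-- It is meant to record the position of each id in the shuffle.
CREATE TABLE random_ids (
    pos INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id  INT NOT NULL
);

-- Refill the table whenever a fresh shuffle is wanted.
-- TRUNCATE also resets the AUTO_INCREMENT counter.
TRUNCATE TABLE random_ids;
INSERT INTO random_ids (id)
SELECT id
FROM table_name
ORDER BY RAND();
```

With such a column, and assuming `pos` values are assigned in the order the SELECT returns rows, a window of 20 rows could be read with an explicit `ORDER BY pos` rather than relying on storage order.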

Record in your application the number of random values that were inserted; then use the following:

SELECT * FROM table_name
INNER JOIN (SELECT id 
    FROM random_ids
    LIMIT ?,20
) r1 ON r1.id = table_name.id;

Here the LIMIT offset is chosen by your application from the range [0, <count>).
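The offset can also be picked inside MySQL with a prepared statement, since LIMIT accepts `?` placeholders there. The following is a sketch under a few assumptions: the statement name `pick20` and the variables `@cnt` and `@offset` are made up for illustration, and the offset is restricted to 0 .. count - 20, because an offset closer to the end of `random_ids` would return fewer than 20 rows.

```sql
-- Number of shuffled ids (the answer suggests tracking this in the application).
SELECT COUNT(*) INTO @cnt FROM random_ids;

-- Choose a start offset that leaves room for a full window of 20 rows.
-- CAST ensures an integer value is passed to LIMIT.
SET @offset = CAST(FLOOR(RAND() * GREATEST(@cnt - 19, 1)) AS UNSIGNED);

PREPARE pick20 FROM
  'SELECT t.*
     FROM table_name AS t
     INNER JOIN (SELECT id FROM random_ids LIMIT ?, 20) AS r1
             ON r1.id = t.id';
EXECUTE pick20 USING @offset;
DEALLOCATE PREPARE pick20;
```

Note that the 20 rows are consecutive entries of one shuffle, so two calls with nearby offsets overlap until `random_ids` is refilled.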
