How can I select a random row from a SQL query selection?

How can I select a row at random from a SQL database query? By this I mean:

Select every row from table1 whose Category contains 'green':

$stmt = $db->query('SELECT * from table1 WHERE Category LIKE "%green%"');

Then display one randomly chosen row from this selection (as opposed to displaying every row, as I've done below):

while ($rows = $stmt->fetch()) {
    echo "<tr><td>" . $rows['Number'] . "</td><td>" . $rows['Content'] . "</td></tr>";
}

In a reasonably-sized data set, order your rows randomly and select the first one:

...ORDER BY RAND() LIMIT 1;

Your statement will become:

$stmt = $db->query(
    'SELECT * from table1
     WHERE Category LIKE "%green%"
     ORDER BY RAND() LIMIT 1;'
);

If you narrow the selection in the query itself, you will not need a messy process in your PHP code to extract a single row from the result set.
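With the narrowed query, the display loop from the question reduces to a single fetch. The following is a minimal sketch of that idea; renderRow is a hypothetical helper name, the Number and Content columns come from the question, and the htmlspecialchars escaping is an addition that the original loop does not have.

```php
<?php
// Hypothetical helper: renders one result row as an HTML table row.
// Escaping with htmlspecialchars is an assumption added here for safety.
function renderRow(array $row): string
{
    return '<tr><td>' . htmlspecialchars((string) $row['Number'])
        . '</td><td>' . htmlspecialchars((string) $row['Content'])
        . '</td></tr>';
}

// Usage, assuming $stmt holds the ORDER BY RAND() LIMIT 1 statement:
// $row = $stmt->fetch(PDO::FETCH_ASSOC);
// if ($row !== false) {
//     echo renderRow($row);
// }
```

Because the query returns at most one row, a single fetch() call replaces the while loop; the false check covers the case where nothing matches.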

If your data set is very large, consider executing multiple queries as recommended by Tobias Hagenbeek:

  1. COUNT() the matching rows.
  2. In PHP, select a random number between 0 and the result of COUNT() minus 1 (the LIMIT offset is zero-based).
  3. Perform a new query to select the row at that offset:

    ...LIMIT <random number>, 1;
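The three steps above can be sketched roughly as follows. This assumes a PDO connection in exception error mode (the default since PHP 8); fetchRandomGreenRow is a hypothetical name, and the offset is bound as an integer and written as LIMIT 1 OFFSET, which is equivalent to LIMIT <random number>, 1.

```php
<?php
// Sketch only: picks one random matching row using two queries.
// Table and column names come from the question; the function name is made up.
function fetchRandomGreenRow(PDO $db): ?array
{
    // Step 1: count the matching rows.
    $count = (int) $db
        ->query("SELECT COUNT(*) FROM table1 WHERE Category LIKE '%green%'")
        ->fetchColumn();
    if ($count === 0) {
        return null; // nothing matches
    }

    // Step 2: choose a zero-based offset in [0, $count - 1].
    $offset = random_int(0, $count - 1);

    // Step 3: fetch only the row at that offset.
    $stmt = $db->prepare(
        "SELECT * FROM table1 WHERE Category LIKE '%green%' LIMIT 1 OFFSET :offset"
    );
    // Bind as an integer so the driver does not quote the value.
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    // A row may have been deleted between the two queries.
    return $row === false ? null : $row;
}
```

The null return covers both an empty selection and the small window in which the row count changes between the COUNT() query and the second query.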

Finally, if you need only a single, arbitrary row and randomness/uniqueness is not an issue, consider selecting the first row from the table every time, as suggested by Gordon Linoff:

...LIMIT 1;

The easy way is this:

$stmt = $db->query('SELECT * from table1 WHERE Category LIKE "%green%" ORDER BY RAND() LIMIT 1');

ORDER BY RAND() orders the results randomly, but it is a relatively expensive operation.

If you care about that cost, you can instead query for the number of matching rows, compute $r = rand(0, $count-1), and then put LIMIT 1 OFFSET $r at the end of your query.

You could use ORDER BY RAND(), but you should be wary of doing so, especially if you are talking about large systems with more than 10k rows.

Here's why...

What happens when you run such a query? Let's say you run it on a table with 10000 rows: the SQL server then generates 10000 random numbers, scans them for the smallest one, and gives you that row. Generating random numbers is a relatively expensive operation. Scanning them for the lowest one (with LIMIT 10, it needs to find the 10 smallest numbers) is also not that fast. If the selected column is text, it is slower; if it has a fixed size, it is faster, maybe because of the need to create a temporary table.
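To make the cost described above concrete, here is a conceptual model in plain PHP. It is not how any particular database engine is implemented; pickRandomRows is a hypothetical name, and the point is only that every row gets a random key and the whole set is ordered before the limit applies.

```php
<?php
// Conceptual model only: mimics "ORDER BY RAND() LIMIT n" in plain PHP.
function pickRandomRows(array $rows, int $limit): array
{
    $keyed = [];
    foreach ($rows as $row) {
        // One random number per row: N rows means N random numbers.
        $keyed[] = [mt_rand(), $row];
    }

    // Order all rows by their random key.
    usort($keyed, fn(array $a, array $b): int => $a[0] <=> $b[0]);

    // Keep only the $limit rows with the smallest keys.
    return array_column(array_slice($keyed, 0, $limit), 1);
}
```

Even with a limit of 1, the loop and the sort touch every row, which is why the work grows with the size of the selection rather than with the number of rows returned.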

So what you should do is count your rows, take a random number between 0 and count-1, and then run SELECT column FROM table LIMIT $generated_number, 1

If you want only one row from the set of all rows, then the fastest method is simply:

 SELECT *
 from table1
 WHERE Category LIKE "%green%"
 LIMIT 1;

This will give you the first row encountered in the data. To a close approximation, this is the first row inserted into the table that matches your criteria. (This is not a guarantee. For instance, deletes could definitely change this.)

This has the advantage of being fast-ish, which is useful because the leading wildcard in LIKE "%green%" means an index will not help with the where clause. In this case, the query does a full table scan but stops at the first match.

The alternative for a truly random row is to use rand():

SELECT *
from table1
WHERE Category LIKE "%green%"
order by rand()
limit 1;

This requires a full table scan that cannot stop early, because all matches are needed for the sort. You then have the additional overhead of sorting the subset by rand(). There are some alternatives, if performance really is an issue.
