简体   繁体   中英

MySQL RAND() how often can it be used? does it use /dev/random?

I have a table with few rows (tops 50), I need to get random value out of table I can do that by
ORDER BY RAND() LIMIT 1
Main question is in the point when I have 6k selects in 5 seconds is rand stil 'reliable'? How is rand calculated, can I seed it over time? (idk, every 5 seconds).

The MySQL pseudo-random number generator is completely deterministic. The docs say:

RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.

It can't use /dev/random because MySQL is designed to work on a variety of operating systems, some of which don't have a /dev/random.

MySQL initializes a default seed at server startup, using the integer returned by time(0) . If you're interested in the source line, it's in the MySQL source in file sql/mysqld.cc, function init_server_components() . I don't think it ever re-seeds itself.

Then the subsequent "random" numbers are based solely on the seed. See source file mysys_ssl/my_rnd.cc, function my_rnd() .


The best practice solution to your random-selection task, for both performance and quality of randomization, is to generate a random value between the minimum primary key value and maximum primary key value. Then use that random value to select a primary key in your table:

SELECT ... FROM MyTable WHERE id > $random LIMIT 1

The reason you'd use > instead of = is that you might have gaps in the id due to rows being deleted or rolled back, or you might have other conditions in your WHERE clause so that you have gaps in between rows that match your conditions.

The disadvantages of this greater-than method:

  • Rows following such a gap have a higher chance of being chosen, and the larger the gap the greater the chance.
  • You need to know the MIN(id) and MAX(id) before you generate the random value.
  • Doesn't work as well if you need more than one random row.

Advantages of this method:

  • It's much faster than ORDER BY RAND(), even for a modest table size.
  • You can use a random function outside of SQL.

RAND is pseudorandom. Be careful using it for security stuff. I don't think your "choose one row randomly out of fifty" is for security, so you're probably OK.

It's pretty fast for a small table. It will be horrible for picking a random row out of a large table: it will has to tag every row with a pseudorandom number and then sort them. For the application you're describing, @TheEwook's suggestion is exactly right; sorting even a small table more often than once a millisecond can swamp even powerful MySQL hardware.

Don't seed RAND, ever, unless you're testing and you want a repeatable sequence of random numbers for some kind of unit test. I learned this the hard way once when generating what I thought were hard-to-guess session tokens. The MySQL guys did a good job with RAND and you can trust them for the application you're talking about.

I think (not sure), if you don't seed it, it starts with a random seed from /dev/random.

If you need crypto-grade random numbers, read /dev/random yourself. But keep in mind that /dev/random can only generate a limited rate. /dev/urandom uses /dev/random to generate a faster rate, but isn't as high-grade in its entropy pool.

If your table is not too big (let's say max 1000 records) it doesn't really matter. But for big tables you must choose an alternative way.

This article may help you:

http://www.titov.net/2005/09/21/do-not-use-order-by-rand-or-how-to-get-random-rows-from-table/

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM