How do you efficiently (in a DB independent manner) select random records from a table?

This seems like an incredibly simple problem; however, it isn't working out as trivially as I'd expected.

I have a club which has club members and I'd like to pull out two members at random from a club.

Using RANDOM()

One way is to use random ordering:

club.members.find(:all, :order => 'RANDOM()', :limit => 2)

However, that isn't database independent: SQLite (the dev database) and Postgres (production) use RANDOM(), whereas in MySQL the function is RAND().

While I could start writing some wrappers around this, the fact that it hasn't been done already, and doesn't seem to be part of ActiveRecord, suggests to me that RANDOM may not be the right way to go.
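For illustration, such a wrapper could be as small as the sketch below. The helper name random_order_sql is hypothetical, and it assumes the adapter name is available as a string (in a Rails app, ActiveRecord::Base.connection.adapter_name returns values such as "Mysql2", "PostgreSQL" or "SQLite"). It only papers over the naming difference; it does nothing about the cost of sorting by a random expression.

```ruby
# Hypothetical helper: maps an ActiveRecord adapter name to the SQL
# function that yields a random sort key on that database.
def random_order_sql(adapter_name)
  case adapter_name.to_s.downcase
  when /mysql/ then 'RAND()'   # MySQL
  else 'RANDOM()'              # SQLite and PostgreSQL
  end
end

# The returned string would be used as the ORDER BY expression
# of the query that fetches the members.
puts random_order_sql('Mysql2')
puts random_order_sql('PostgreSQL')
puts random_order_sql('SQLite')
```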

Pulling items out directly using their index

Another way of doing this is to pull the set in order but then select random records from it:

First off we need to generate a sequence of two unique indices corresponding to the members:

all_indices = 1..club.members.count
two_rand_indices = all_indices.to_a.shuffle.slice(0,2)

This gives an array with two indices guaranteed to be unique and random. We can use these indices to pull out our records:

@user1, @user2 = club.members.values_at(*two_rand_indices)

What's the best method?

While the second method seems pretty nice, I also feel like I might be missing something and might have overcomplicated a simple problem. I'm clearly not the first person to have tackled this, so what is the best, most SQL-efficient route through it?

The problem with your first method is that it sorts the whole table by an unindexable expression, just to take two rows. This does not scale well.

The problem with your second method is similar: if you have 10^9 rows in your table, then to_a will generate a huge array. That will take a lot of memory, and a lot of time to shuffle.

Also, by using values_at, aren't you assuming that there's a row for every primary key value from 1 to count, with no gaps? You shouldn't assume that.

What I'd recommend instead is:

  1. Count the rows in the table.

     c = club.members.count
  2. Pick two random offsets between 0 and the count minus 1 (an offset equal to the count would skip every row).

     r_a = 2.times.map { Random.rand(c) }
  3. Query your table with limit and offset.
    Don't use ORDER BY; just rely on the RDBMS's arbitrary ordering.

     for r in r_a
       row = club.members.limit(1).offset(r).first
     end

    The two offsets can coincide, so redraw if the rows must be distinct.
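As a rough sketch of step 2, the offsets can be drawn so that they are distinct without building the full range as an array. The helper name distinct_random_offsets is hypothetical, and the loop assumes the number of offsets wanted is small compared with the row count, so collisions are rare.

```ruby
require 'set'

# Hypothetical helper: picks n distinct offsets in 0...count without
# materialising the whole range as an array.
def distinct_random_offsets(count, n)
  raise ArgumentError, 'not enough rows' if n > count

  offsets = Set.new
  # Keep drawing until n different values have been collected;
  # the Set silently ignores repeats.
  offsets << rand(count) while offsets.size < n
  offsets.to_a
end

offsets = distinct_random_offsets(1_000_000_000, 2)
puts offsets.inspect
```

Each offset would then drive one single-row query, as in step 3. Memory use depends on how many offsets are requested, not on the size of the table.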

The Order By RAND() function in MySQL:

ORDER BY RAND() LIMIT 4

When this is the final clause of the query, it selects 4 rows at random.

Try the randumb gem; it implements the second method you mentioned.
