简体   繁体   中英

Select a random row from table, but with odds?

I have a table that describes a variety of objects in my system (ie. umbrella, boots, satchel, whatever). Each one of these objects needs to have a distinct prevalence or incidence. For example, the umbrella is rarer than the boots. Based on those factors, I need to randomly select a single object (including a blank or 'no object found') based on that incidence value.

Yikes. Make sense?

SELECT * FROM some_table
WHERE (100*RAND()) > some_table.percent_probability
LIMIT 1

....and the probability of selection is stored in the percent_probability field.

C.

If you have a write-seldom-read-many scenario (ie you change the objects and the probabilities seldom) you might want to pre-calculate the probability values so that if you have a single random value you can unambiguously decide which object to pick (with a single pick, no sorting, no comparison of all records needed).

Eg (probabilities in per-mill)
umbrella: 500‰ chance
boots: 250‰ chance
satchel: 100‰ chance
whatever: 100‰ chance
"nothing": 50‰ chance

A random number between 0 and 499 means "umbrella" has been picked, 500-749 "boots" and so on.

INSERT INTO foo (name, randmin, randmax) VALUES
  ('umbrella', 0, 499),  
  ('boots', 500, 749),
  ('satchel', 750, 849), 
  ('whatever', 850, 949) 

Every time you add an object or modify the probabilities re-create this table.

Then all you need is a query like

SELECT
  f.name
FROM
  (  
    SELECT Round(Rand()*1000) as r    
  )  as tmp
JOIN
  foo as f  
ON
  r BETWEEN f.randmin and f.randmax  
LIMIT
  1

Only one random value has to be generated and MySQL can use an index on (randmin,randmax) to find the record quickly.

I'm going to modify symcbean's answer for this, +1 for symcbean.

SELECT * FROM some_table
WHERE (100*RAND()) < some_table.percent_probability

This will return ALL results that match the probability you intuitively want to assign to them. For example, 5 objects with a probability of 20 will all be returned 20% of the time. Objects with a value of 90 will be returned 90% of the time.

So your result will be more than one object, but you've kept the rare ones from showing up as often. So now just grab one of your results at random. An easy way would be to stick them all in an array and:

$items = array(); // assuming you've already filled $items with your 
                  // query results, one item for each array key

$count = count($items);

$chosen_key = rand(1,$count)-1;

$chosen_item = $items[$chosen_key];

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM