
Quick random row in PostgreSQL: why do floor(random()*N) plus select * from a where id = const together take 100 times less time than select where id = random?

I need to quickly select a random row from a PostgreSQL table. I've read "Best way to select random rows PostgreSQL" and "quick random row selection in Postgres".

By far the quickest approach I've read about is:

CREATE EXTENSION IF NOT EXISTS tsm_system_rows;
SELECT myid  FROM mytable TABLESAMPLE SYSTEM_ROWS(1);

2 ms average. But as noted in comments it is not "completely random".

I've tried

SELECT id FROM a OFFSET floor(random()*3000000) LIMIT 1;

15-200 ms.

The simplest idea is to select by id as my ids are continuous. But

select floor(random ()*1000); 2ms
select * from a where id=233; 2ms (and again 2ms for other constants)

but

SELECT * FROM a where id = floor(random ()*1000)::integer; 300ms!!!

Why 300 ms and not 4? Is it possible to reorder the query somehow (hints etc.) to get it down to 4 ms?

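This is because random() is defined as volatile, so Postgres evaluates it again for each and every row. It cannot use the expression as an index lookup key, so it effectively goes through all rows. Because each row is compared with a different random value, the query can also return no rows or more than one row.

If you want to prevent that, "hide" it behind an (otherwise useless) subselect, which is evaluated only once: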

SELECT * 
FROM a 
where id = (select trunc(random ()*1000)::integer);
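One way to see the difference is to compare the plans with EXPLAIN. The sketch below is not from the original post: it builds a hypothetical temporary table a_demo (standing in for a, with ids 0 to 999999 and a primary key index) and plans both forms of the query. On such a table the first form would be expected to show a sequential scan with the random() expression as a filter, and the second an index scan fed by a one-time InitPlan; exact plans and timings depend on your data and Postgres version.

```sql
-- Hypothetical stand-in for table "a" (name and size are assumptions).
CREATE TEMP TABLE IF NOT EXISTS a_demo (id integer PRIMARY KEY);
INSERT INTO a_demo SELECT generate_series(0, 999999) ON CONFLICT DO NOTHING;
ANALYZE a_demo;

-- Volatile expression directly in WHERE: re-evaluated for every row.
EXPLAIN
SELECT * FROM a_demo WHERE id = floor(random() * 1000)::integer;

-- Same expression wrapped in an uncorrelated subselect: evaluated once,
-- so the result can be used as an index lookup key.
EXPLAIN
SELECT * FROM a_demo WHERE id = (SELECT trunc(random() * 1000)::integer);
```

Replacing EXPLAIN with EXPLAIN ANALYZE runs the queries as well and reports actual timings.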

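The following pertains strictly to the OP's follow-up question after the answer by @a-horse-with_no-name: "Strangely it becomes slow without ::integer. Why is that?"

Because without the cast the comparison is no longer integer to integer. The type returned by random() is double precision and remains so after the trunc() function is applied (how the value is displayed is determined by your system). Comparing the integer column id with a double precision value makes Postgres convert id to double precision for every row, so an ordinary index on id cannot be used and the table is scanned. The ::integer syntax itself is a Postgres extension equivalent to the SQL standard "select cast(number as integer)".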
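The effect of the missing cast can be checked the same way. This is a minimal sketch on a hypothetical temporary table a_demo (an assumption, standing in for a). Without the cast, the plan would be expected to show a filter that casts id to double precision; with the cast, an index condition on id. Treat the comments as expectations rather than guaranteed output.

```sql
-- Hypothetical stand-in for table "a" (name and size are assumptions).
CREATE TEMP TABLE IF NOT EXISTS a_demo (id integer PRIMARY KEY);
INSERT INTO a_demo SELECT generate_series(0, 999999) ON CONFLICT DO NOTHING;
ANALYZE a_demo;

-- The subselect returns double precision, so id is compared as double precision.
EXPLAIN
SELECT * FROM a_demo WHERE id = (SELECT trunc(random() * 1000));

-- The subselect returns integer, which matches the indexed column type.
EXPLAIN
SELECT * FROM a_demo WHERE id = (SELECT trunc(random() * 1000)::integer);
```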

In its general form, val::data_type casts val to the specified data_type (provided a valid cast exists). If val is itself an expression, the form becomes (val)::data_type.
The following shows step by step what a-horse-with-no-name's query is doing and indicates the data type at each step. The CTE is there strictly so that each step uses the same value, since calling random() each time would generate different values.

with gen as (select random() as n)              -- one random value, reused in every step
select  n, pg_typeof(n)                         -- step 1: double precision in [0, 1)
     ,  n*1000, pg_typeof(n*1000)               -- step 2: double precision in [0, 1000)
     ,  trunc(n*1000), pg_typeof(trunc(n*1000)) -- step 3: whole number 0..999, still double precision
     ,  trunc(n*1000)::integer, pg_typeof(trunc(n*1000)::integer) -- step 4: integer 0..999
  from gen;

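BTW the trunc() function is needed in the above: in Postgres, casting a double precision value to integer rounds to the nearest integer rather than discarding the decimal digits. Without trunc() (or floor()) the result could be 1000, and the values 0 and 1000 would each be about half as likely as the others.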
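To check how the cast treats the fractional part on your own system, a small query like the following can help. It uses a fixed value (999.7, chosen only for illustration) instead of random() so the result is repeatable.

```sql
SELECT (999.7::double precision)::integer      AS cast_only    -- rounds to 1000
     , trunc(999.7::double precision)::integer AS trunc_first  -- 999
     , floor(999.7::double precision)::integer AS floor_first; -- 999
```

For non-negative values trunc() and floor() give the same result, which is why the question's floor() and the answer's trunc() are interchangeable here.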

I hope this helps you understand what's happening.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 