I have a table with the below schema in AWS Athena
Number of unique standard_lab_parameter_name, units pair in DB is ~3k & the DB has about 80 Million entries. Now, I wish to get 1000 samples(random) per unique standard_lab_parameter_name, units pair hence nearly 3k x 1000 rows. I tried searching the internet for any such query but in vain. Any help?
You can use a CTE to generate random row numbers for each standard_lab_parameter_name
, units
pair, and then select the first 1000 rows for each pair by requiring the row number to be <= 1000
:
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY standard_lab_parameter_name, units ORDER BY RANDOM()) AS rn
FROM yourtable
)
SELECT *
FROM CTE
WHERE rn <= 1000
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.