I have seen a few questions like this, but nothing has answered what I'm looking for.
I have 5,000 rows of data from over 3 years. Every line has a memberID, so memberIDs repeat and are only unique to an individual (but they will repeat in the column if the individual is in the system multiple times over 3 years).
How can I pull 100 random memberIDs over the course of 3 years? (So theoretically there would be more than 100 lines because memberIDs can repeat)
EDIT: I should clarify, Member ID is character, not numeric. Ex: W4564
NOTE: This is NOT looking for n rows, rather 100 different IDs over the course of 3 years, so an ID might be associated with 3 rows in the result. The result will have a differing number of rows each time the SQL is run.
You could simply grab the rows whose memberID is returned by a subquery. For example:
SELECT *
FROM <yourtable>
WHERE memberID IN (SELECT DISTINCT TOP 100 memberID FROM <yourtable>)
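Without an ORDER BY, TOP 100 returns whichever 100 memberIDs the engine happens to read first. That depends on your index and is arbitrary rather than random. If you need to force a random pick, you can do like in the linked question in the comments and sort by newid(). SQL Server rejects ORDER BY newid() combined with SELECT DISTINCT (the ORDER BY items must appear in the select list), so de-duplicate in a derived table first:
SELECT *
FROM <yourtable>
WHERE memberID IN (
    SELECT TOP 100 memberID
    FROM (SELECT DISTINCT memberID FROM <yourtable>) AS ids
    ORDER BY newid()
)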
Using order by newid() you get a random sort order. Using where exists you can isolate only those members for which data exists in the past three years. You need to do that at this stage, otherwise you might accidentally end up with members that don't have any recent data at all. By adding top 100 you select just 100 rows out of that set.
The combination should get the 100 random member ids for which data exists in the past three years:
select top 100
  m.MemberID
from
  Member m
where
  exists (select 'x'
          from MemberData d
          where d.MemberId = m.MemberId
            and d.DataDate > dateadd(year, -3, getdate()))
order by
  newid()
Then you could use that query in an in clause to get data from the same MemberData table, or any other table for that matter:
select
  md.*
from
  MemberData md
where
  -- Same filter to get only the recent data
  md.DataDate > dateadd(year, -3, getdate()) and
  -- Only of 100 random members that have been active in the past 3 years.
  md.MemberId in (
    select top 100
      m.MemberID
    from
      Member m
    where
      exists (select 'x'
              from MemberData d
              where d.MemberId = m.MemberId
                and d.DataDate > dateadd(year, -3, getdate()))
    order by
      newid()
  )
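If there is no separate Member table and everything lives in one table, as the question seems to describe, the same idea can be sketched with two CTEs. This is only a sketch under assumptions: it reuses the MemberData, MemberId and DataDate names from the query above, which are placeholders and not the asker's real schema.

```sql
-- Assumed single-table layout: MemberData(MemberId, DataDate, ...).
-- Table and column names are placeholders, adjust them to your schema.
WITH RecentMembers AS (
    -- One row per member that has data in the past 3 years
    SELECT DISTINCT MemberId
    FROM MemberData
    WHERE DataDate > DATEADD(year, -3, GETDATE())
),
RandomMembers AS (
    -- Random 100 of those members
    SELECT TOP (100) MemberId
    FROM RecentMembers
    ORDER BY NEWID()
)
SELECT md.*
FROM MemberData AS md
JOIN RandomMembers AS rm
    ON rm.MemberId = md.MemberId
-- Same filter again so only the recent rows of the sampled members come back
WHERE md.DataDate > DATEADD(year, -3, GETDATE());
```

Doing the DISTINCT in its own CTE keeps it apart from the ORDER BY NEWID(), and the result still has a varying number of rows per run because each sampled member can have several rows.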