简体   繁体   中英

SQL Server 2000: How to get top 10 for each distinct field. A loop Maybe?

I have the following table

MyTable
  ID
  MessageType
  MessageDate
  MessageBody

The table is a few million rows but there are only 100 unique MessageType in it.

What I need is a sample of each MessageType (must include at least MessageType and MessageBody), but I can't do a DISTINCT as that only gets me the MessageType column.

I am thinking something like

SELECT TOP 5 *
 FROM MyTable
 WHERE MessageType IN (SELECT DISTINCT MessageType FROM MyTable)

I know this doesn't work as it just me the top 5, but I am not sure how to make SQL loop through this.

Thanks for any help

The Row_Number version

;WITH cte AS
(
SELECT   ID,
  MessageType,
  MessageDate,
  MessageBody,
  ROW_NUMBER() OVER (PARTITION BY MessageType ORDER BY (SELECT 0)) AS RN
FROM MyTable
)
SELECT   ID,
  MessageType,
  MessageDate,
  MessageBody
FROM cte 
WHERE RN <=5

The CROSS APPLY version

WITH m1 AS
(
SELECT DISTINCT MessageType
FROM MyTable
)

SELECT m2.*
FROM  m1
CROSS APPLY
(
SELECT TOP 5 *
 FROM MyTable m2
 WHERE m2.MessageType = m1.MessageType
 ) m2

Martin, if I'm reading your answer correctly, I think what you will produce is 5 samples of each message. Marc_s just wants one sample from each message.

I think what you need is:

SELECT  ID,
        MessageType,
        MessageDate
FROM (
    SELECT ID,
           MessageType,
           MessageDate,
           ROW_NUMBER() OVER (PARTITION BY MessageType, ORDER BY NEWID() ) AS RN
       -- I am using NewID() because it will produce a nice random sampling, 
       -- but Mark's SELECT(0) will be faster.
    FROM   MyTable
    ) sampling
WHERE RN =1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM