How to Select top n Records for Each Category

Question

I'm trying to write a query that joins 2 tables and will give me the top 5 names and the amount of items they sold at a location between a certain date range ie 01-01-2016 through 12-31-2017.

From what I've been researching, This is what I came up with:

SELECT
    EmployeeName, 
    COUNT(ID) AS 'Number of Deals', 
    CompanyNumber
FROM
(
    SELECT
        EmployeeName, 
        DealID, 
        CompanyNumber,
        ROW_NUMBER() OVER (PARTITION BY CompanyNumber ORDER BY DealID) AS rn
    FROM Deal
    JOIN DealEmployee
        ON Deal.DealID. =DealEmployee.DealID AS T
    WHERE
        Deal.Status = 2 AND
        Date BETWEEN '2016-01-01' AND '2017-12-31' AND
        EmployeeName != '' AND T.rn <=5

I am hoping to get the the below result:

I am quite the novice, and I know my syntax is incorrect. Any help given would be greatly appreciated

Answer 1

not tested, but I would try something like:

with
basedata as (    
select EmployeeName
     , CompanyNumber
     , COUNT(ID) as Number_of_Deals
  from Deal
  join DealEmployee
    on Deal.DealID = DealEmployee.DealID    
 where Deal.Status = 2
   and Date between '2016-01-01' and '2017-12-31'
   and EmployeeName !=''
 group by EmployeeName
        , CompanyNumber 
)
,
basedata_with_rank as (
select t.*
     , row_number() over (partition by CompanyNumber order by Number_of_Deals desc) rn
  from basedata
)
select *
  from basedata_with_rank
 where rn <= 5
 order by CompanyNumber
        , Number_of_Deals desc

Using CTEs makes queries usually more readable. By the way: I would avoid to give a column the name "date" (a reserved word) and I would always use qualified names for my columns. Perhaps the use of the analytic functions rank or dense rank would be more appropriate, but row_number should also work.

Answer 2

You need to filter on the row number outside the subquery, after it has been calculated/materialized. I think this is what you want:

SELECT EmployeeName, CompanyNumber, cnt
FROM
(
    SELECT EmployeeName, CompanyNumber, COUNT(*) AS cnt,
        ROW_NUMBER() OVER (PARTITION BY CompanyNumber ORDER BY COUNT(*) DESC) rn
    FROM Deal d
    INNER JOIN DealEmployee de
        ON d.DealID = de.DealID
    WHERE d.Status = 2 AND Date BETWEEN '2016-01-01' AND '2017-12-31' AND
          EmployeeName != ''
    GROUP BY EmployeeName, CompanyNumber
) t
WHERE rn <= 5
ORDER BY
    CompanyNumber,
    cnt DESC;

Note that we perform the GROUP BY aggregation inside the subquery, where we also compute the row number.

Answer 3

I think you want this:

SELECT EmployeeName, num_deals, CompanyNumber
FROM (SELECT EmployeeName, CompanyNumber, COUNT(*) as num_deals,
             ROW_NUMBER() OVER (PARTITION BY CompanyNumber ORDER BY COUNT(*) DESC) AS seqnum
      FROM Deal d JOIN
           DealEmployee de
           ON Deal.DealID. = de.DealID 
      WHERE d.Status = 2 AND
            Date BETWEEN '2016-01-01' AND '2017-12-31' AND
            EmployeeName <> ''
      GROUP BY employee, CompanyNumber
     ) t
WHERE seqnum <= 5;

Notes:

You need to aggregate before doing the row_number() , because you want to enumerate based on the count. (This version combines the aggregation and row_number() into a single subquery; the aggregation goes first.)
The number of deals is aggregated by employee and company.
When you have multiple tables, qualify all column names.
Your query has a bunch of other errors, such as misplaced table aliases and some lines are missing.

How to Select top n Records for Each Category

Question

3 answers

solution1
2 ACCPTED 2018-08-09 15:17:03

solution2
1 2018-08-09 15:15:14

solution3
0 2018-08-09 16:36:28

How to Select top n Records for Each Category

Question

3 answers

solution1 2 ACCPTED 2018-08-09 15:17:03

solution2 1 2018-08-09 15:15:14

solution3 0 2018-08-09 16:36:28

solution1
2 ACCPTED 2018-08-09 15:17:03

solution2
1 2018-08-09 15:15:14

solution3
0 2018-08-09 16:36:28