SQL Server 2005 Syntax Help - “Select Info based upon Max Value of Sub Query”

Question

The objective is below the list of tables.

Tables:

Table: Job

JobID
CustomerID
Value
Year

Table: Customer

CustomerID
CustName

Table: Invoice

SaleAmount
CustomerID

The Objective

Part 1: (easy) I need to select all invoice records and sort by Customer (To place nice w/ Crystal Reports)

Select * from Invoice as A inner join Customer as B on A.CustomerID = B.CustomerID

Part 2: (hard) Now, we need to add two fields:

JobID associated with that customer's job that has the Maximum Value (from 2008)
Value associated with that job

Pseudo Code

Select * from 
Invoice as A
inner join Customer as B on A.CustomerID = B.CustomerID
inner join 
(select JobID, Value from Jobs where Job:JobID has the highest value out of all of THIS customer's jobs from 2008)

General Thoughts

This is fairly easy to do If I am only dealing with one specific customer:
select max(JobId), max(Value) as MaxJobID from Jobs where Value = (select max(Value) from Jobs where CustomerID = @SpecificCustID and Year = '2008') and CustomerID = SpecificCustID and CustomerID = '2008'

This subquery determines the max Value for this customer in 2008, and then its a matter of choosing a single job (can't have dupes) out of potential multiple jobs from 2008 for that customer that have the same value.

The Difficulty

What happens when we don't have a specific customer ID to compare against? If my goal is to select ALL invoice records and sort by customer, then this subquery needs access to which customer it is currently dealing with. I suppose this can "sort of" be done through the ON clause of the JOIN, but that doesn't really seem to work because the sub-sub query has no access to that.

I'm clearly over my head. Any thoughts?

Answer 1

How about using a CTE . Obviously, I can't test, but here is the idea. You need to replace col1, col2, ..., coln with the stuff you want to select.

Inv( col1, col2, ... coln)
AS
(
  SELECT col1, col2, ... coln,
     ROW_NUMBER() OVER (PARTITION BY A.CustomerID 
     ORDER BY A.Value DESC) AS [RowNumber]
      FROM Invoice A INNER JOIN Customer B ON A.CustomerID = B.CustomerID
      WHERE A.CustomerID = @CustomerID
      AND A.Year = @Year
)
SELECT * FROM Inv WHERE RowNumber = 1

If you don't have a CustomerID, this will return the top value for each customer (that will hurt on performance tho).

Answer 2

The over function is an awesome, but often neglected function. You can use it in a subquery to pull back your valid jobs, like so:

select
    a.*
from
    invoice a
    inner join customer b on
        a.customerid = b.customerid
    inner join (select customerid, max(jobid) as jobid, maxVal from
               (select customerid, 
                jobid,
                value,
                max(value) over (partition by customerid) as maxVal
                from jobs
                where Year = '2008') s
               where s.value = s.maxVal
               group by customerid, maxVal) c on
        b.customerid = c.customerid
        and a.jobid = c.jobid

Essentially, that first inner query looks like this:

select 
    customerid, 
    jobid,
    value,
    max(value) over (partition by customerid) as maxVal
from jobs
where Year = '2008'

You'll see that this pulls back all of the jobs, but with that additional column which lets you know what the maximum value is for each customer. With the next subquery, we filter out any rows that have value and maxVal equal. Additionally, it finds the max JobID based on customerid and maxVal , because we need to pull back one and only one JobID (as per the requirements).

Now, you have a complete listing of CustomerID and JobID that meet the conditions of having the highest JobID that contains the maximum Value for that CustomerID in a given year. All that's left is to join it to Invoice and Customer , and you're good to go.

Answer 3

The row_number() function can give you what you need:

Select A.*, B.*, C.JobID, C.Value
from 
Invoice as A
inner join Customer as B on A.CustomerID = B.CustomerID
inner join (
   select JobID, Value, CustomerID,
   ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY Value DESC) AS Ordinal
   from Jobs
   WHERE Year = 2008
) AS C ON (A.CustomerID = C.customerID AND C.Ordinal = 1)

The ROW_NUMBER() function in this query will order by value in descending order and the PARTITION BY clause will do this separately for each different value of CustomerID. This means that the highest Value for each customer will always be 1, so we can join to that value.

Answer 4

Just to be complete with the non row_number solution for those < MSSQL 2005. Personanly, I find it easier to follow myslef...but that could be biased considering how much time I spend in MSSQL 2000 vs 2005+.

SELECT * 
FROM Invoice as A
INNER JOIN Customer as B ON
    A.CustomerID = B.CustomerID
INNER JOIN (
    SELECT
       CustomerId,
       --MAX in case dupe Values. 
       ==If UC on CustomerId, Value (or CustomerId, Year, Value) then not needed
       MAX(JobId) as JobId 
    FROM Jobs
    JOIN (
        SELECT
            CustomerId,
            MAX(Value) as MaxValue
        FROM Jobs
        WHERE Year = 2008
        GROUP BY
            CustomerId
    ) as MaxValue ON
         Jobs.CustomerId = MaxValue.CustomerId
         AND Jobs.Value = MaxValue.MaxValue
    WHERE Year = 2008
    GROUP BY
         CustomerId
) as C ON
     B.CustomerID = C.CustomerID

SQL Server 2005 Syntax Help - “Select Info based upon Max Value of Sub Query”

Question

4 answers

solution1
2 2009-08-11 04:50:29

solution2
1 2009-08-11 04:49:10

solution3
1 ACCPTED 2009-08-11 04:53:29

solution4
1 2009-08-11 23:36:38

SQL Server 2005 Syntax Help - “Select Info based upon Max Value of Sub Query”

Question

4 answers

solution1 2 2009-08-11 04:50:29

solution2 1 2009-08-11 04:49:10

solution3 1 ACCPTED 2009-08-11 04:53:29

solution4 1 2009-08-11 23:36:38

solution1
2 2009-08-11 04:50:29

solution2
1 2009-08-11 04:49:10

solution3
1 ACCPTED 2009-08-11 04:53:29

solution4
1 2009-08-11 23:36:38