简体   繁体   中英

Why CROSS APPLY with columns and aggregate functions needs Group by

QUESTION

Why CROSS APPLY query with columns and aggregate functions mixed on SELECT doesn't work? it requires GROUP BY clause or all columns with aggregate function.

CODE

CREATE TABLE A
(
A_ID int identity(1,1), 
A_NAME Varchar(20)
)

INSERT INTO A (A_NAME)
VALUES
('A'), ('AA'), ('AAA')

CREATE TABLE B
(
B_ID int identity(10,10), 
B_NAME Varchar(20),
A_ID int
)

INSERT INTO B (B_NAME, A_ID)
VALUES
('B', 1), ('BB', 3), ('BBB', 2)

--DOESNT WORK IF THE SELECT HAS COLUMNS MIXED WITH AGGREGATE FUNCTION

    SELECT A_NAME, MAX(B_NAME)
    FROM A
    CROSS APPLY(SELECT * FROM B WHERE B.A_ID = A.A_ID) as AxB

--WORKS WITH AGGREGATE FUNCTIONS

    SELECT MAX(A_NAME), MAX(B_NAME)
    FROM A
    CROSS APPLY(SELECT * FROM B WHERE B.A_ID = A.A_ID) as AxB

--WORKS WITH GROUP BY

    SELECT A_NAME, MAX(B_NAME)
    FROM A
    CROSS APPLY(SELECT * FROM B WHERE B.A_ID = A.A_ID) as AxB
    GROUP BY A_NAME

--RESET

DROP TABLE A
DROP TABLE B

UPDATE

CONCLUSION (Thanks @Richardissimo's)

I have a Sybase 15.7 background and there you can mix columns with aggregate functions and have no Group by Clause. in those situations the default group would be the whole table. so in my code I was expecting the Maximum B_NAME for all A_NAMES in A Table.

However, in SQL Server when I tested the code without the cross apply I received the same error. It Looks Like SQL Server forbids this type of behavior. Which is kinda cool, with this they are saying that you will not get more rows than the number of groups on the aggregated query.

THAT'S CONSISTENCY

The issue isn't related to cross apply , it relates to the way aggregation works. Let's work backwards through your 3 scenarios:

SELECT A_NAME, MAX(B_NAME) with Group By A_NAME means you get a row back for each distinct A_NAME, and each one will be accompanied by the 'biggest' value of B_NAME where the A_NAME has that value.

SELECT MAX(A_NAME), MAX(B_NAME) has no group by, which means the grouping is done across all the rows, meaning that the query will only ever return a single row.

SELECT A_NAME, MAX(B_NAME) without Group By doesn't make sense. If it was just SELECT A_NAME , it would return the value in A_NAME for each of the rows. So putting MAX(B_NAME) in the same query makes no sense, as without a Group By it has no way of knowing what to aggregate over.

Hope this helps.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM