I have a table like:
SELECT
s.date,
s.orderid,
s.num1,
s.num2,
s.sales,
s.price
FROM sales AS s
Resulting in:
date | orderid | num1 | num2 | sales | price
2020-11-01 | 1 | a | aa | 1 | 10
2020-11-01 | 8 | k | kk | 1 | 10
2020-11-02 | 1 | a | aa | -1 | 10
2020-11-01 | 2 | b | bb | 2 | 8
2020-11-01 | 3 | c | cc | 1 | 10
2020-11-01 | 3 | c | cc | 2 | 9
2020-11-04 | 18 | u | uu | 5 | 2
"orderid" and "num1" should only appear once together; otherwise it's a return (the second entry has "sales" of -1, negating the earlier sale). So I need to remove those entries completely (not keeping either row). Apart from that, "orderid" has no meaning and is not needed.
I want to group by "date", "num1" and "num2", summing up all sales and getting the average price, while removing orderid + num1 combinations that appear more than once together.
End result should be:
date | orderid | num1 | num2 | sales | price
2020-11-01 | 8 | k | kk | 1 | 10
2020-11-01 | 2 | b | bb | 2 | 8
2020-11-01 | 3 | c | cc | 3 | 9.5
2020-11-04 | 18 | u | uu | 5 | 2
How can I do this with a GROUP BY? So far I have this:
SELECT
s.date,
s.num1,
s.num2,
SUM(s.sales),
AVG(s.price)
FROM sales AS s
GROUP BY s.date, s.num1, s.num2
You can use window functions. Based on your description (removing orders that appear more than once), you can use count(*):
select s.date, s.num1, s.num2, SUM(s.sales), AVG(s.price)
from (select s.*, count(*) over (partition by orderid, num1) as cnt
from sales s
) s
where cnt = 1
group by s.date, s.num1, s.num2;
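To see what the count(*) filter does to the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or later, needed for window functions), the inner alias t, and the column aliases total_sales and avg_price are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# Sample rows copied from the question: date, orderid, num1, num2, sales, price
ROWS = [
    ("2020-11-01", 1, "a", "aa", 1, 10),
    ("2020-11-01", 8, "k", "kk", 1, 10),
    ("2020-11-02", 1, "a", "aa", -1, 10),
    ("2020-11-01", 2, "b", "bb", 2, 8),
    ("2020-11-01", 3, "c", "cc", 1, 10),
    ("2020-11-01", 3, "c", "cc", 2, 9),
    ("2020-11-04", 18, "u", "uu", 5, 2),
]

# Same logic as the query above; ORDER BY added only to make the output stable.
COUNT_QUERY = """
SELECT s.date, s.num1, s.num2,
       SUM(s.sales) AS total_sales,
       AVG(s.price) AS avg_price
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY t.orderid, t.num1) AS cnt
      FROM sales t) s
WHERE s.cnt = 1
GROUP BY s.date, s.num1, s.num2
ORDER BY s.date, s.num1
"""


def run_count_filter():
    """Load the sample rows and return the rows produced by COUNT_QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (date TEXT, orderid INTEGER, num1 TEXT, "
        "num2 TEXT, sales INTEGER, price REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(COUNT_QUERY).fetchall()
    conn.close()
    return rows


count_result = run_count_filter()
for row in count_result:
    print(row)
```

On this data the filter removes order 1 (the sale and its return) but also order 3, because its two rows share the same orderid and num1. The c / cc line from the expected result is therefore missing, which is worth checking against what you actually need.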
I suspect you really want row_number(), so that you keep one of the duplicate rows.
You can use group by and having as follows:
SELECT MAX(s.date) AS date,
s.orderid,
s.num1,
s.num2,
SUM(s.sales),
AVG(s.price)
FROM sales AS s
GROUP BY s.orderid, s.num1, s.num2
HAVING SUM(s.sales) > 0;
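As a quick check of the group by / having approach, the sketch below runs it against the sample rows from the question in an in-memory SQLite database via Python's sqlite3 module. SQLite, the column aliases total_sales and avg_price, and the added ORDER BY are assumptions for illustration only.

```python
import sqlite3

# Sample rows copied from the question: date, orderid, num1, num2, sales, price
ROWS = [
    ("2020-11-01", 1, "a", "aa", 1, 10),
    ("2020-11-01", 8, "k", "kk", 1, 10),
    ("2020-11-02", 1, "a", "aa", -1, 10),
    ("2020-11-01", 2, "b", "bb", 2, 8),
    ("2020-11-01", 3, "c", "cc", 1, 10),
    ("2020-11-01", 3, "c", "cc", 2, 9),
    ("2020-11-04", 18, "u", "uu", 5, 2),
]

# Group per order; keep only orders whose sales do not net out to zero or less.
HAVING_QUERY = """
SELECT MAX(s.date) AS date,
       s.orderid, s.num1, s.num2,
       SUM(s.sales) AS total_sales,
       AVG(s.price) AS avg_price
FROM sales AS s
GROUP BY s.orderid, s.num1, s.num2
HAVING SUM(s.sales) > 0
ORDER BY s.orderid
"""


def run_having_filter():
    """Load the sample rows and return the rows produced by HAVING_QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (date TEXT, orderid INTEGER, num1 TEXT, "
        "num2 TEXT, sales INTEGER, price REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(HAVING_QUERY).fetchall()
    conn.close()
    return rows


having_result = run_having_filter()
for row in having_result:
    print(row)
```

On the sample data this drops order 1 (its sales sum to 0) and keeps order 3 with sales 3 and an average price of 9.5, matching the expected result. Two things to keep in mind: an order with only a partial return still passes the HAVING filter, and an order spread over several days is reported under its latest date.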
Question:
Is this a transaction log in which orderId 1 could have sales entries 10, 5, -1, 7, 8, which should result in a value of 15? The 10 and 5 are negated by the -1. If so, you need a query that finds all rows after the last -1 for that orderId and sums up their sales values.
Another case is sales values of 5, 6, -1, 7, 9, -1, 10, 2 for the same orderId, where only the 10 and 2 should count towards the final amount.
Something like:
Query 1 - find the max(date) value for each orderId where the sales amount is -1.
Query 2 - use query 1 to get all transactions for each orderId where date > the date from query 1.
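WITH q1 AS (
-- Query 1: date of the last -1 entry per order
SELECT r.OrderId, MAX(r.Date) AS Date
FROM sales r
WHERE r.sales = -1
GROUP BY r.OrderId
)
SELECT s.OrderId, SUM(s.sales) AS TotalSales, AVG(s.Price) AS AveragePrice
FROM sales s
LEFT OUTER JOIN q1 ON q1.OrderId = s.OrderId
WHERE (q1.Date IS NULL) OR (s.Date > q1.Date)
GROUP BY s.OrderId;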
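To make the "rows after the last -1" idea concrete, here is a minimal sketch run through Python's sqlite3 module. It uses the sequence 5, 6, -1, 7, 9, -1, 10, 2 from above for order 1 and adds an order 2 without any return. The dates (one entry per day), the flat price of 10, order 2 itself, and the names last_return, last_date, total_sales and avg_price are all hypothetical, chosen only for illustration. It also assumes every entry of an order has a distinct date, otherwise "after the last -1" is not well defined.

```python
import sqlite3

# Hypothetical transaction log: (date, orderid, sales, price)
LOG_ROWS = [
    ("2020-11-01", 1, 5, 10),
    ("2020-11-02", 1, 6, 10),
    ("2020-11-03", 1, -1, 10),
    ("2020-11-04", 1, 7, 10),
    ("2020-11-05", 1, 9, 10),
    ("2020-11-06", 1, -1, 10),
    ("2020-11-07", 1, 10, 10),
    ("2020-11-08", 1, 2, 10),
    ("2020-11-01", 2, 3, 10),  # order without a return
    ("2020-11-02", 2, 4, 10),
]

AFTER_LAST_RETURN_QUERY = """
WITH last_return AS (
    -- Query 1: date of the most recent -1 entry per order
    SELECT r.orderid, MAX(r.date) AS last_date
    FROM sales r
    WHERE r.sales = -1
    GROUP BY r.orderid
)
-- Query 2: keep rows after that date, or all rows if there was no return
SELECT s.orderid,
       SUM(s.sales) AS total_sales,
       AVG(s.price) AS avg_price
FROM sales s
LEFT OUTER JOIN last_return q1 ON q1.orderid = s.orderid
WHERE q1.last_date IS NULL OR s.date > q1.last_date
GROUP BY s.orderid
ORDER BY s.orderid
"""


def run_after_last_return():
    """Load the hypothetical log and sum sales after each order's last return."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (date TEXT, orderid INTEGER, sales INTEGER, price REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", LOG_ROWS)
    rows = conn.execute(AFTER_LAST_RETURN_QUERY).fetchall()
    conn.close()
    return rows


log_result = run_after_last_return()
for row in log_result:
    print(row)
```

Order 1 ends up with 10 + 2 = 12, since everything up to the second -1 is discarded, while order 2 keeps both of its rows because the left join finds no return for it.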