Select max value of each group that depends on other column
EmpNumber City Total Sales
----------------------------------------------------------------------------------
1811 Boston $14557260.03
1862 Boston $12435892.06
1873 Boston $9786058.60
1803 Chichago $18266965.58
1825 Chichago $11958100.98
1877 Chichago $15569868.52
My table looks like this. How do I get the best employee from each city according to their sales?
Desired output:
EmpNumber City Total Sales
----------------------------------------------------------------------------------
1811 Boston $14557260.03
1803 Chichago $18266965.58
I have tried:
select employeenumber, city, max(TotalSales)
from(
select employeenumber, a.city, sum(quantityordered*priceeach) as TotalSales
from offices a, employees b, customers c, orders d, orderdetails e
where a.officeCode = b.officeCode
and b.employeenumber = c.salesrepemployeenumber
and c.customernumber = d.customernumber
and d.ordernumber = e.ordernumber
group by employeenumber, a.city
order by a.city)
group by employeenumber, city;
But I still get 3 employees from Boston and 3 employees from Chichago. What I want is only ONE employee from each city. Thank you.
Just use the row_number() analytic function:
select employeenumber, city, TotalSales
from
(
select e.employeenumber, o.city,
-- total sales per employee; nvl() treats missing values as 0
sum(nvl(od.quantityordered,0)*nvl(od.priceeach,0)) as TotalSales,
-- rank employees within each city, highest total first
row_number() over
( partition by o.city
order by sum(nvl(od.quantityordered,0)*nvl(od.priceeach,0)) desc )
as rn
from offices o
join employees e on o.officeCode = e.officeCode
join customers c on e.employeenumber = c.salesrepemployeenumber
join orders ord on c.customernumber = ord.customernumber
join orderdetails od on ord.ordernumber = od.ordernumber
group by e.employeenumber, o.city
)
where rn = 1
If ties can occur for the top value of TotalSales and all tied employees should be included in the result, replace row_number() with dense_rank(), which is another analytic function.
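To see the difference between the two functions, here is a minimal sketch that runs both against the sample data using Python's built-in sqlite3 module. It assumes a SQLite build with window function support (3.25 or later), a simplified table named tbl with a TotalSales column, and one hypothetical employee 9999 added only to create a tie in Boston; none of these come from the original schema.

```python
import sqlite3

# Sample rows from the question, plus one HYPOTHETICAL employee (9999)
# who ties the top Boston total so the two functions behave differently.
rows = [
    (1811, "Boston", 14557260.03),
    (1862, "Boston", 12435892.06),
    (1873, "Boston", 9786058.60),
    (9999, "Boston", 14557260.03),  # hypothetical tie, not in the original data
    (1803, "Chichago", 18266965.58),
    (1825, "Chichago", 11958100.98),
    (1877, "Chichago", 15569868.52),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (EmpNumber integer, City text, TotalSales real)")
conn.executemany("insert into tbl values (?, ?, ?)", rows)


def top_per_city(rank_fn):
    """Return the (EmpNumber, City) pairs ranked first in each city by rank_fn."""
    sql = f"""
        select EmpNumber, City
        from (
            select EmpNumber, City,
                   {rank_fn}() over (partition by City order by TotalSales desc) as rn
            from tbl
        )
        where rn = 1
        order by City, EmpNumber
    """
    return conn.execute(sql).fetchall()


with_row_number = top_per_city("row_number")  # exactly one row per city
with_dense_rank = top_per_city("dense_rank")  # every employee tied for first
print(with_row_number)
print(with_dense_rank)
```

With row_number() each city yields exactly one row, and which of the tied Boston employees is returned is not guaranteed. With dense_rank() both tied employees are returned.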
This will give you the desired result once you have created a temp table named tbl from the first dataset you shared above.
select b.EmpNumber, a.City, a.Max_Sales as `Max Sales` from
(select City, max(`Total Sales`) as Max_Sales
from tbl group by City) a
left join
(select EmpNumber, City, `Total Sales` as drop_later from tbl) b
on a.City = b.City and a.Max_Sales = b.drop_later
This is the output in Spark SQL:
EmpNumber City Max Sales
0 1811 Boston 14557260.03
1 1803 Chichago 18266965.58
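The aggregate-then-join idea can also be checked outside Spark. The following is a minimal sketch using Python's built-in sqlite3 module, assuming the sample rows are loaded as numbers into a table named tbl. It joins on City as well as on the maximum, so that an equal sales figure in a different city cannot produce a false match. SQLite quotes the `Total Sales` column with double quotes where Spark SQL uses backticks.

```python
import sqlite3

# The six sample rows from the question, with sales stored as plain numbers.
rows = [
    (1811, "Boston", 14557260.03),
    (1862, "Boston", 12435892.06),
    (1873, "Boston", 9786058.60),
    (1803, "Chichago", 18266965.58),
    (1825, "Chichago", 11958100.98),
    (1877, "Chichago", 15569868.52),
]

conn = sqlite3.connect(":memory:")
conn.execute('create table tbl (EmpNumber integer, City text, "Total Sales" real)')
conn.executemany("insert into tbl values (?, ?, ?)", rows)

# Step 1 (subquery a): the maximum sales figure per city.
# Step 2 (join): find the employee in the same city whose sales equal that maximum.
best = conn.execute("""
    select b.EmpNumber, a.City, a.Max_Sales
    from (select City, max("Total Sales") as Max_Sales
          from tbl
          group by City) a
    join tbl b
      on b.City = a.City and b."Total Sales" = a.Max_Sales
    order by a.City
""").fetchall()

for emp_number, city, max_sales in best:
    print(emp_number, city, max_sales)
```

Unlike the row_number() approach, this returns every employee tied for the top figure in a city, which matches the dense_rank() behavior.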