Select row with least value in multiple columns without ROW_NUMBER

For each group, I want to get the row with the minimum value across two columns.

I have a table that has listings for items I want, as well as their cost and distance from me.

mytable:
item | cost | dist
-----+------+---------
1    | $2   | 1.0
1    | $3   | 0.5
1    | $4   | 2.0
2    | $2   | 2.0
2    | $2   | 1.5
2    | $2   | 4.0
2    | $8   | 1.0
2    | $12  | 3.0
3    | $1   | 5.0

For each item, I want to get the row that has the min cost; if several rows share the min cost, I want the one with the min dist,

so my result would be

item | cost | dist
-----+------+---------
1    | $2   | 1.0
2    | $2   | 1.5
3    | $1   | 5.0

I know I can achieve this result using

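SELECT *
FROM (
    -- The alias [RID] cannot be referenced in the WHERE clause of the same
    -- SELECT, so the ranking is computed in a derived table first.
    SELECT *
         , ROW_NUMBER() OVER(PARTITION BY item ORDER BY cost ASC, dist ASC) AS [RID]
    FROM mytable
) AS ranked
WHERE [RID] = 1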

but the problem comes when I have 100,000 items each with 100,000 listings, and sorting the whole table becomes incredibly time-consuming.

Since I only need the top 1 of each group, I'm wondering if there is another way to get the result I want without sorting the whole table of 10,000,000,000 entries.

Currently using SQL Server 2012.

A nice article on this topic is Optimizing TOP N Per Group Queries by Itzik Ben-Gan. It discusses a concatenation approach.

For example, if your table is

CREATE TABLE #YourTable
  (
     item INT,
     cost MONEY CHECK (cost >= 0),
     dist DECIMAL(10, 2) CHECK (dist >= 0)
  ) 

you might use

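WITH T AS
(
-- Build one fixed-width string per row: 10 digits of cost, then 10 digits of dist
SELECT item,
       MIN(FORMAT(CAST(cost * 100 AS INT), 'D10') + FORMAT(CAST(dist * 100 AS INT), 'D10')) AS MinConcat
FROM #YourTable
GROUP BY item
)
-- Split the winning string back into its two parts; the explicit INT casts
-- avoid relying on implicit string-to-numeric conversion
SELECT item,
       CAST(CAST(LEFT(MinConcat, 10) AS INT) / 100.0 AS MONEY) AS cost,
       CAST(CAST(RIGHT(MinConcat, 10) AS INT) / 100.0 AS DECIMAL(10, 2)) AS dist
FROM T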

So this can be done in a single grouping operation on item (which could be a hash aggregate without any sort).

You need to be careful that the concatenated value, when compared as a string, sorts in the same order as cost, dist would when compared as raw column values, so the query above may need adjusting if your datatypes are different.

It currently reserves the leftmost 10 characters for cost, represented as an integer number of pence and padded with leading zeros, and the next 10 characters for dist, similarly scaled by 100 and padded to a 10-digit integer.
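The ordering requirement can be checked outside the database. Below is a minimal Python sketch (not T-SQL, and not part of the original answer) that mimics the query: it builds the same 20-character zero-padded key, keeps the smallest key per item in a dictionary, roughly as a hash aggregate would, and decodes the winner. The function names `encode`, `decode` and `min_per_item` are illustrative assumptions; the sample rows are the ones from the question.

```python
from decimal import Decimal

# Sample rows from the question: (item, cost, dist), values kept as text.
ROWS = [
    (1, "2", "1.0"), (1, "3", "0.5"), (1, "4", "2.0"),
    (2, "2", "2.0"), (2, "2", "1.5"), (2, "2", "4.0"),
    (2, "8", "1.0"), (2, "12", "3.0"),
    (3, "1", "5.0"),
]

def encode(cost, dist):
    """Build a fixed-width key: 10 digits of cost*100, then 10 digits of dist*100."""
    cost_units = int(Decimal(cost) * 100)
    dist_units = int(Decimal(dist) * 100)
    return f"{cost_units:010d}{dist_units:010d}"

def decode(key):
    """Split the key back into (cost, dist) as Decimal values."""
    return Decimal(key[:10]) / 100, Decimal(key[10:]) / 100

def min_per_item(rows):
    """Single pass, no sort: keep the smallest key seen so far for each item."""
    best = {}
    for item, cost, dist in rows:
        key = encode(cost, dist)
        if item not in best or key < best[item]:
            best[item] = key
    return {item: decode(key) for item, key in best.items()}

result = min_per_item(ROWS)
for item in sorted(result):
    print(item, *result[item])
```

The padding is what makes this safe: compared as plain strings, "10" sorts before "9", whereas the padded keys for a cost of 9 and a cost of 10 compare in numeric order. The sketch also assumes non-negative values, matching the CHECK constraints on the table.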

You can do it this way:

; with c as 
(select min(cost) as cost, item
from mytable
group by item)
select t.* from mytable t
inner join c
on c.item = t.item and c.cost=t.cost;

However, I'd recommend adding an index on the item and cost columns to make the query fast.

[Edit] After re-reading the OP's question: when there are ties in cost, it should look like the following,

; with c as 
(select min(cost) as cost, item
from mytable
group by item)
, c2 as (
select t.cost, t.item, min(dist) as dist from mytable t
inner join c
on c.item = t.item and c.cost=t.cost
group by t.cost, t.item)
select t.item, t.cost, t.dist from mytable t
inner join c2
on c2.item = t.item and c2.cost = t.cost and c2.dist = t.dist;

Maybe there are better ways, but this should work.

If you have a table of items, then this might work:

select i.*, t.*
from items i cross apply
     (select top (1) m.*
      from mytable m
      where m.item = i.item
      order by m.cost, m.dist
     ) t;

For this to be efficient, you need an index on (item, cost, dist).
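To see why that index helps, here is a rough Python analogy (an illustration only, not how SQL Server is implemented): a sorted list of (item, cost, dist) tuples stands in for the index, and a binary search for the first entry of each item stands in for the seek that TOP (1) performs. The names `index` and `first_listing` are hypothetical.

```python
from bisect import bisect_left

# A sorted list of tuples stands in for an index on (item, cost, dist).
index = sorted([
    (1, 2, 1.0), (1, 3, 0.5), (1, 4, 2.0),
    (2, 2, 2.0), (2, 2, 1.5), (2, 2, 4.0),
    (2, 8, 1.0), (2, 12, 3.0),
    (3, 1, 5.0),
])

def first_listing(index, item):
    """Return the lowest (cost, dist) entry for item, or None if it has no rows."""
    # (item,) sorts before every (item, cost, dist) tuple, so this finds the
    # position of the first entry for that item with a binary search.
    pos = bisect_left(index, (item,))
    if pos < len(index) and index[pos][0] == item:
        return index[pos]
    return None

items = [1, 2, 3]  # plays the role of the items table
top_rows = [first_listing(index, i) for i in items]
print(top_rows)
```

Each lookup touches only the start of one item's range instead of scanning or sorting all of that item's listings, which is the point of ordering the index by item first, then cost, then dist.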

Something like this should work:

select
    t.item, MIN(t.cost) as mincost, min(t2.mindist) as mindist
from mytable t
inner join (
select item, cost, MIN(dist) as mindist
    from mytable
    group by
        item, cost
) t2 on t.item = t2.item
group by t.item,t2.cost
having MIN(t.cost) = t2.cost
