Selecting one distinct column, i.e. fine-tuning a select query
I use the query below to join two (or possibly several) tables; it is generated by a PHP script I have written. The Ids in the tables are not identical, but part of the Id in the first table also appears in the Id of the second, which is why I apply SUBSTRING to them. In a perfect world I would simply left join on Id in both tables, but the Ids are not the same.
select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id, t1.Time
from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0
Inner JOIN
(select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1
on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) ORDER BY t1.Time DESC
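To make the join condition easier to follow, here is a minimal Python sketch of what the SUBSTRING/CHARINDEX expression extracts. The helper names (`charindex`, `substring`, `join_key`) and the two sample Ids are hypothetical; the real Id format is not shown here, so the sample values only illustrate the idea of pulling out the text between two underscores.

```python
def charindex(needle: str, haystack: str, start: int = 1) -> int:
    """Mimic T-SQL CHARINDEX: 1-based position of needle, or 0 if absent."""
    # str.find is 0-based and returns -1 when nothing is found,
    # so adding 1 yields the T-SQL convention in both cases.
    return haystack.find(needle, max(start, 1) - 1) + 1


def substring(text: str, start: int, length: int) -> str:
    """Mimic T-SQL SUBSTRING for start >= 1 and length >= 0."""
    return text[start - 1:start - 1 + length]


def join_key(row_id: str) -> str:
    """Return the part of the Id that the ON clause compares."""
    first = charindex("_", row_id, 10)   # first underscore at or after position 10
    second = charindex("_", row_id, 15)  # first underscore at or after position 15
    return substring(row_id, first + 1, second - first - 1)


# Hypothetical Ids (assumed format, not taken from the real tables).
content_id = "20150401_B_CUST0042_content"
cpu_id = "20150418_A_CUST0042_cpu"

print(join_key(content_id), join_key(cpu_id))
```

Two rows are joined whenever `join_key` returns the same value for both Ids, which is why many _Content rows can match a single _Cpu row.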
Here I get a lot of rows that are identical except for StopTime; see the example below:
StopTime Time
2015-04-01 23:59:00.000 2015-04-18 23:00:01
2015-04-02 23:59:00.000 2015-04-18 23:00:01
2015-04-03 23:59:00.000 2015-04-18 23:00:01
2015-04-04 23:59:00.000 2015-04-18 23:00:01
2015-04-05 23:59:00.000 2015-04-18 23:00:01
2015-04-06 23:59:00.000 2015-04-18 23:00:01
2015-04-07 23:59:00.000 2015-04-18 23:00:01
2015-04-08 23:59:00.000 2015-04-18 23:00:01
2015-04-09 23:59:00.000 2015-04-18 23:00:01
2015-04-10 23:59:00.000 2015-04-18 23:00:01
2015-04-11 23:59:00.000 2015-04-18 23:00:01
2015-04-12 23:59:00.000 2015-04-18 23:00:01
2015-04-13 23:59:00.000 2015-04-18 23:00:01
2015-04-14 23:59:00.000 2015-04-18 23:00:01
2015-04-15 23:59:00.000 2015-04-18 23:00:01
2015-04-16 23:59:00.000 2015-04-18 23:00:01
2015-04-17 23:59:00.000 2015-04-18 23:00:01
2015-04-18 23:59:00.000 2015-04-18 23:00:01
But here I only want each unique Time once. Is it possible to get a single row per unique Time with the latest StopTime, like the example below?
StopTime Time
2015-04-01 23:59:00.000 2015-04-18 23:00:01
I tried adding a GROUP BY clause inside the second select statement, like this:
(select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content] group by StopTime)
But I get this error:
Column 'dbcust.dbo._Content.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Perhaps you can help me fine-tune my select query in order to speed up the data gathering? =)
Thanks in advance.
You can do this:
WITH YourQuery
AS
(
-- t1.Id is aliased because column names in a CTE must be unique
select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id AS CpuId, t1.Time
from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0
Inner JOIN
(select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1
on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1))
), Ranked
AS
(
select Id,
CustomerName,
Region,
Country,
StopTime,
CustomerId,
CpuId,
Time,
-- number the rows of each Time, latest StopTime first
ROW_NUMBER() OVER(PARTITION BY Time ORDER BY StopTime DESC) AS RN
from YourQuery
)
SELECT Id,
CustomerName,
Region,
Country,
StopTime,
CustomerId,
CpuId,
Time
FROM Ranked
WHERE RN = 1;
The ROW_NUMBER function numbers the rows within each Time, ordered by StopTime descending, so selecting WHERE RN = 1 gives you the row with the latest StopTime for each Time.
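As a rough, runnable illustration of the ranking idea, the sketch below runs the same ROW_NUMBER pattern against an in-memory SQLite database from Python. SQLite (3.25 or newer, for window functions) stands in for SQL Server here, and the `joined` table, the customer name "Acme" and the second Time value are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result of the join in the question.
conn.execute("CREATE TABLE joined (CustomerName TEXT, StopTime TEXT, Time TEXT)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [
        ("Acme", "2015-04-16 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-17 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-18 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-18 23:59:00.000", "2015-04-19 23:00:01"),
    ],
)

# Number the rows of each Time with the latest StopTime first, then keep row 1.
latest_rows = conn.execute(
    """
    WITH Ranked AS (
        SELECT CustomerName, StopTime, Time,
               ROW_NUMBER() OVER (PARTITION BY Time ORDER BY StopTime DESC) AS RN
        FROM joined
    )
    SELECT CustomerName, StopTime, Time
    FROM Ranked
    WHERE RN = 1
    ORDER BY Time
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Each distinct Time comes back exactly once, together with all the other columns of the row that has the latest StopTime.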
For the query you tried:
select distinct Id,CustomerName,Region,Country,
StopTime,CustomerId
from [dbcust].[dbo].[_Content]
group by StopTime
In SQL Server, when you GROUP BY a column, every column in the select list must either appear in the GROUP BY clause or be wrapped in an aggregate function, so a correct version would look like this:
select stopTime, MIN(CustomerId) -- just an example
from [dbcust].[dbo].[_Content]
group by StopTime
Or, using your full query, you can do this:
SELECT stoptime, MAX(Time) AS LatestTime
FROM
(
select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id AS CpuId, t1.Time
from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0
Inner JOIN
(select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1
on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1))
) AS t
GROUP BY stoptime
This gives you one row per StopTime, but only the StopTime and the latest Time columns.
So, to select the other columns as well as these two, I used the ranking function.
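For comparison, here is a small sketch of the aggregate approach turned around so that it groups by Time and takes MAX(StopTime), which matches the one-row-per-Time shape asked for in the question. It again uses an in-memory SQLite database from Python as a stand-in for SQL Server, with a made-up `joined` table and partly invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result of the join in the question.
conn.execute("CREATE TABLE joined (CustomerName TEXT, StopTime TEXT, Time TEXT)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [
        ("Acme", "2015-04-16 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-17 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-18 23:59:00.000", "2015-04-18 23:00:01"),
        ("Acme", "2015-04-17 23:59:00.000", "2015-04-19 23:00:01"),
    ],
)

# One row per Time; only the grouped column and the aggregate are returned.
per_time = conn.execute(
    """
    SELECT Time, MAX(StopTime) AS LatestStopTime
    FROM joined
    GROUP BY Time
    ORDER BY Time
    """
).fetchall()

for row in per_time:
    print(row)
```

The result has only two columns, which is the limitation of plain GROUP BY: any further column would have to be grouped or aggregated as well.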