Selecting one distinct column, i.e. fine-tuning a select query

I use the query below to join one table (or possibly several; the query is generated by a PHP script I wrote) on Ids that are not identical. Part of the Id in the first table also appears in the second table's Id, which is why I join on a substring. In a perfect world I would simply left join on Id in both tables if the values were the same, but they are not.

select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id, t1.Time
from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
Inner JOIN 
(select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
        on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) ORDER BY t1.Time DESC

Here I get a lot of rows that are identical except for StopTime; see the example below:

       StopTime                   Time
2015-04-01 23:59:00.000    2015-04-18 23:00:01
2015-04-02 23:59:00.000    2015-04-18 23:00:01
2015-04-03 23:59:00.000    2015-04-18 23:00:01
2015-04-04 23:59:00.000    2015-04-18 23:00:01
2015-04-05 23:59:00.000    2015-04-18 23:00:01
2015-04-06 23:59:00.000    2015-04-18 23:00:01
2015-04-07 23:59:00.000    2015-04-18 23:00:01
2015-04-08 23:59:00.000    2015-04-18 23:00:01
2015-04-09 23:59:00.000    2015-04-18 23:00:01
2015-04-10 23:59:00.000    2015-04-18 23:00:01
2015-04-11 23:59:00.000    2015-04-18 23:00:01
2015-04-12 23:59:00.000    2015-04-18 23:00:01
2015-04-13 23:59:00.000    2015-04-18 23:00:01
2015-04-14 23:59:00.000    2015-04-18 23:00:01
2015-04-15 23:59:00.000    2015-04-18 23:00:01
2015-04-16 23:59:00.000    2015-04-18 23:00:01
2015-04-17 23:59:00.000    2015-04-18 23:00:01
2015-04-18 23:59:00.000    2015-04-18 23:00:01

But I only want each unique Time once. Is it possible to get one row per unique Time, with the latest StopTime?

Like this:

        StopTime                   Time
2015-04-01 23:59:00.000    2015-04-18 23:00:01

I tried adding a GROUP BY inside the second select statement, like this:

(select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content] group by StopTime)

But I get this error:

Column 'dbcust.dbo._Content.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Perhaps you can help me fine-tune my select query to speed up the data gathering? =)

Thanks in advance.

You can do this:

WITH YourQuery
AS
(
    -- t1.Id is aliased as CpuId because a CTE cannot expose two columns named Id
    select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id AS CpuId, t1.Time
    from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
    Inner JOIN 
    (select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
            on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) 
), Ranked
AS
(
   select Id,
     CustomerName,
     Region,
     Country,
     StopTime,
     CustomerId,
     CpuId, 
     Time,
     -- number the rows within each Time, latest StopTime first
     ROW_NUMBER() OVER(PARTITION BY Time ORDER BY StopTime DESC) AS RN
    from  YourQuery
)
SELECT Id,
     CustomerName,
     Region,
     Country,
     StopTime,
     CustomerId,
     CpuId, 
     Time
FROM Ranked
WHERE RN = 1;

The ROW_NUMBER function numbers the rows within each Time value, ordered by StopTime descending, so selecting WHERE RN = 1 gives you the row with the latest StopTime for each Time.
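To see the ranking step in isolation, here is a minimal sketch that should run on SQL Server without your tables. The rows in the VALUES list are made up for illustration and only stand in for the joined result; the names Joined and v are placeholders, not objects from your database.

```sql
-- Hypothetical sample rows standing in for the joined result (not the real tables).
-- ISO 8601 literals with 'T' are used so the conversion to datetime does not depend on language settings.
WITH Joined AS
(
    SELECT CAST(v.StopTime AS datetime) AS StopTime,
           CAST(v.[Time]   AS datetime) AS [Time]
    FROM (VALUES
            ('2015-04-16T23:59:00', '2015-04-18T23:00:01'),
            ('2015-04-17T23:59:00', '2015-04-18T23:00:01'),
            ('2015-04-18T23:59:00', '2015-04-18T23:00:01'),
            ('2015-04-17T23:59:00', '2015-04-19T23:00:01'),
            ('2015-04-18T23:59:00', '2015-04-19T23:00:01')
         ) AS v (StopTime, [Time])
),
Ranked AS
(
    SELECT StopTime,
           [Time],
           -- restart the numbering for every distinct Time; the latest StopTime gets RN = 1
           ROW_NUMBER() OVER (PARTITION BY [Time] ORDER BY StopTime DESC) AS RN
    FROM Joined
)
SELECT StopTime, [Time]
FROM Ranked
WHERE RN = 1;
```

With this sample data the query should return one row per distinct Time, each paired with the latest StopTime in its group. Changing DESC to ASC would pick the earliest StopTime instead.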


For the query you tried:

select distinct Id,CustomerName,Region,Country, 
   StopTime,CustomerId 
from [dbcust].[dbo].[_Content] 
group by StopTime

In SQL Server, when you group by a column, you can only select columns that appear in the GROUP BY clause or inside an aggregate function. Written correctly, it would look like this:

select stopTime, MIN(CustomerId) -- just an example
from [dbcust].[dbo].[_Content] 
group by StopTime
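As a small illustration of that rule, the sketch below uses a table variable with made-up rows (the @Content variable and its values are assumptions for the example, not your real _Content table). Every column in the select list is either the grouping column or wrapped in an aggregate, so the query is accepted.

```sql
-- Hypothetical stand-in for [_Content]; only three columns are modelled.
DECLARE @Content TABLE (Id varchar(50), CustomerId int, StopTime datetime);

INSERT INTO @Content (Id, CustomerId, StopTime) VALUES
    ('A_1', 7, '2015-04-17T23:59:00'),
    ('A_2', 7, '2015-04-17T23:59:00'),
    ('B_1', 9, '2015-04-18T23:59:00');

-- StopTime is the grouping column; every other selected value is aggregated.
SELECT StopTime,
       COUNT(*)        AS RowsAtStopTime,
       MIN(Id)         AS SmallestId,
       MAX(CustomerId) AS LargestCustomerId
FROM @Content
GROUP BY StopTime;
```

Selecting a bare Id or CustomerId here, without an aggregate and without adding it to the GROUP BY, would raise the same error you saw.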

Or, using your full query, you can do this:

SELECT Time, MAX(StopTime) AS LatestStopTime
FROM
(
    -- t1.Id is aliased as CpuId because a derived table cannot expose two columns named Id
    select t0.Id,t0.CustomerName,t0.Region,t0.Country,t0.StopTime,t0.CustomerId,t1.Id AS CpuId, t1.Time
            from (select distinct Id,CustomerName,Region,Country, StopTime,CustomerId from [dbcust].[dbo].[_Content]) t0 
            Inner JOIN 
            (select distinct Id, Time from [dbcust].[dbo].[_Cpu]) t1 
                    on SUBSTRING(t1.Id,CHARINDEX('_',t1.Id,10)+1,(CHARINDEX('_',t1.Id,15) - CHARINDEX('_',t1.Id,10)-1))=SUBSTRING(t0.Id,CHARINDEX('_',t0.Id,10)+1,(CHARINDEX('_',t0.Id,15) - CHARINDEX('_',t0.Id,10)-1)) 
) AS t
GROUP BY Time

This will give you one row per unique Time, which is the result you are looking for, but with only two columns: the Time and its latest StopTime.

So, to select the other columns as well as these two, I used the ranking function.
