
SQL use of OVER and PARTITION BY

I have the following table:

ClientID    | Location  | Episode   | Date  
001         | Area1     | 4         | 01Dec16  
001         | Area2     | 3         | 01Nov16  
001         | Area2     | 2         | 01Oct16  
001         | Area1     | 1         | 01Sep16  
002         | Area2     | 3         | 21Dec16  
002         | Area1     | 2         | 21Nov16  
002         | Area1     | 1         | 21Oct16    

And I'm looking to create two new columns based on the client's latest episode:

ClientID    | Location  | Episode   | Date  | LatestEpisode     | LatestLocation   
001         | Area1     | 4         | Dec   | 4                 | Area1  
001         | Area2     | 3         | Nov   | 4                 | Area1   
001         | Area2     | 2         | Oct   | 4                 | Area1  
001         | Area1     | 1         | Sep   | 4                 | Area1  
002         | Area2     | 3         | Dec   | 3                 | Area2  
002         | Area1     | 2         | Nov   | 3                 | Area2  
002         | Area1     | 1         | Oct   | 3                 | Area2      

I have worked out that I can use OVER to get the LatestEpisode: LatestEpisode = MAX(Episode) OVER(PARTITION BY ClientID)

But I can't work out how to get the LatestLocation.

EDIT: Sorry if I haven't got the format right, this is my first post. I was trying to look at how to post correctly but I found it quite confusing.

I have searched Stack Overflow many times over the last 3 days and have found various approaches using OVER and ROW_NUMBER(), but I don't have a lot of experience with them. Many of the examples I found were fine for producing an aggregated table, but I want to keep the full table, which is why I thought using OVER was the way to go.
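To illustrate the difference between an aggregated table and one that keeps every row, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable; it needs SQLite 3.25 or later for window functions), and the table name Episodes is hypothetical.

```python
import sqlite3

# Hypothetical in-memory copy of the question's table (dates omitted for brevity).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Episodes (ClientID TEXT, Location TEXT, Episode INTEGER)")
conn.executemany(
    "INSERT INTO Episodes VALUES (?, ?, ?)",
    [("001", "Area1", 4), ("001", "Area2", 3), ("001", "Area2", 2), ("001", "Area1", 1),
     ("002", "Area2", 3), ("002", "Area1", 2), ("002", "Area1", 1)],
)

# GROUP BY collapses each client to a single row.
grouped = conn.execute(
    "SELECT ClientID, MAX(Episode) FROM Episodes GROUP BY ClientID ORDER BY ClientID"
).fetchall()

# OVER (PARTITION BY ...) keeps every row and repeats the per-client maximum on each.
windowed = conn.execute(
    "SELECT ClientID, Episode, MAX(Episode) OVER (PARTITION BY ClientID) AS LatestEpisode "
    "FROM Episodes ORDER BY ClientID, Episode DESC"
).fetchall()

print(len(grouped), "rows with GROUP BY")
print(len(windowed), "rows with OVER")
```

The windowed query returns all seven rows, which matches the requirement to keep the full table.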

SQL Server 2012 introduced the FIRST_VALUE() function, which enables you to write your select query like this:

SELECT  ClientID, 
        Location, 
        Episode, 
        [Date], 
        LatestEpisode = FIRST_VALUE(Episode) OVER(PARTITION BY ClientID ORDER BY [Date] DESC), 
        LatestLocation = FIRST_VALUE(Location) OVER(PARTITION BY ClientID ORDER BY [Date] DESC) 
FROM tableName
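One detail worth checking with FIRST_VALUE() is the sort direction: ordering by date descending puts the latest row first. The sketch below, again using Python's sqlite3 as a stand-in for SQL Server (an assumption, with a hypothetical Episodes table and ISO-formatted text dates in a hypothetical EpisodeDate column), also shows why LAST_VALUE() with an ascending order is not a drop-in replacement: with the default window frame it only looks as far as the current row, unless the frame is widened explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; EpisodeDate holds ISO text so that it sorts chronologically.
conn.execute(
    "CREATE TABLE Episodes (ClientID TEXT, Location TEXT, Episode INTEGER, EpisodeDate TEXT)"
)
conn.executemany(
    "INSERT INTO Episodes VALUES (?, ?, ?, ?)",
    [("001", "Area1", 4, "2016-12-01"), ("001", "Area2", 3, "2016-11-01"),
     ("001", "Area2", 2, "2016-10-01"), ("001", "Area1", 1, "2016-09-01"),
     ("002", "Area2", 3, "2016-12-21"), ("002", "Area1", 2, "2016-11-21"),
     ("002", "Area1", 1, "2016-10-21")],
)

query = """
SELECT ClientID, Episode, Location,
       -- latest row first, so FIRST_VALUE picks the latest location
       FIRST_VALUE(Location) OVER (PARTITION BY ClientID ORDER BY EpisodeDate DESC) AS LatestLocation,
       -- default frame ends at the current row, so this returns the row's own location
       LAST_VALUE(Location) OVER (PARTITION BY ClientID ORDER BY EpisodeDate) AS DefaultFrame,
       -- widening the frame to the whole partition gives the latest location again
       LAST_VALUE(Location) OVER (PARTITION BY ClientID ORDER BY EpisodeDate
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FullFrame
FROM Episodes
ORDER BY ClientID, Episode DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

In this data every date within a client is unique, so the DefaultFrame column simply echoes each row's own Location, while LatestLocation and FullFrame agree.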

In SQL Server, I would do this with cross apply:

select e.*, elast.episode as LatestEpisode, elast.location as LatestLocation
from episodes e cross apply
     (select top 1 e2.*
      from episodes e2
      where e2.clientId = e.clientId
      order by e2.episode desc
     ) elast;

Although you can express this logic with window functions, the lateral join (implemented in SQL Server using the apply keyword) is a more natural way of expressing the logic.

If you are not familiar with lateral joins, you can think of them as correlated subqueries in the from clause, but ones that allow you to return multiple columns. I should add, though, that one of the main use cases is for table-valued functions, so it is a very powerful construct.
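For readers who want to experiment without a SQL Server instance, the idea of "one top row per client, joined back to every row" can be sketched as follows. SQLite has no apply or lateral join, so this example swaps in a different technique: a join against a derived table ranked with ROW_NUMBER(). It is an illustration of the result the lateral join produces, not of the apply syntax itself, and the Episodes table is a hypothetical copy of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Episodes (ClientID TEXT, Location TEXT, Episode INTEGER)")
conn.executemany(
    "INSERT INTO Episodes VALUES (?, ?, ?)",
    [("001", "Area1", 4), ("001", "Area2", 3), ("001", "Area2", 2), ("001", "Area1", 1),
     ("002", "Area2", 3), ("002", "Area1", 2), ("002", "Area1", 1)],
)

query = """
SELECT e.ClientID, e.Episode,
       elast.Episode  AS LatestEpisode,
       elast.Location AS LatestLocation
FROM Episodes AS e
JOIN (
    -- rank each client's rows so that rn = 1 is the highest episode
    SELECT ClientID, Episode, Location,
           ROW_NUMBER() OVER (PARTITION BY ClientID ORDER BY Episode DESC) AS rn
    FROM Episodes
) AS elast
  ON elast.ClientID = e.ClientID AND elast.rn = 1
ORDER BY e.ClientID, e.Episode DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

As with the lateral join, both columns of the latest row come from a single derived row source rather than from two separate scalar subqueries.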

First, you need to select the LatestEpisode for each client; then you can use this value to identify the row from which to get the LatestLocation:

SELECT *
    ,(
        SELECT Location
        FROM Episodes
        WHERE ClientId = MyTable.ClientId
            AND Episode = MyTable.LatestEpisode
        ) AS LatestLocation
FROM (
    SELECT *
        ,MAX(Episode) OVER (PARTITION BY ClientId) AS LatestEpisode
    FROM Episodes
    ) AS MyTable

You can also use a common table expression (CTE):

WITH cte
AS (
    SELECT *
        ,MAX(Episode) OVER (PARTITION BY ClientId) AS LatestEpisode
    FROM Episodes
    )
SELECT cte.*
    ,(
        SELECT Location
        FROM Episodes
        WHERE ClientId = cte.ClientId
            AND Episode = cte.LatestEpisode
        ) AS LatestLocation
FROM cte
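The CTE version uses only standard SQL, so it can be checked against the desired output from the question with a small harness. This sketch assumes Python's built-in sqlite3 as a stand-in for SQL Server and a hypothetical Episodes table; an ORDER BY is added only to make the output order predictable. It also assumes episode numbers are unique per client, since the scalar subquery is expected to match exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Episodes (ClientID TEXT, Location TEXT, Episode INTEGER)")
conn.executemany(
    "INSERT INTO Episodes VALUES (?, ?, ?)",
    [("001", "Area1", 4), ("001", "Area2", 3), ("001", "Area2", 2), ("001", "Area1", 1),
     ("002", "Area2", 3), ("002", "Area1", 2), ("002", "Area1", 1)],
)

query = """
WITH cte AS (
    -- step 1: attach the highest episode number to every row of the client
    SELECT *, MAX(Episode) OVER (PARTITION BY ClientID) AS LatestEpisode
    FROM Episodes
)
SELECT cte.*,
       (
           -- step 2: look up the location of the row holding that episode
           SELECT Location
           FROM Episodes
           WHERE ClientID = cte.ClientID
             AND Episode = cte.LatestEpisode
       ) AS LatestLocation
FROM cte
ORDER BY cte.ClientID, cte.Episode DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```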

I have worked on it and was able to produce the required result. Please try the following:

Declare @Table table ( ClientID varchar(max), Location varchar(500), Episode int, Dated varchar(30)) 

Insert Into @Table 
Values ('001', 'Area1', 4 ,'01Dec16' )
,('001', 'Area2', 3, '01Nov16')
, ('001', 'Area2', 2, '01Oct16')  
,('001' ,'Area1' ,1, '01Sep16')
,('002' ,'Area2' ,3, '21Dec16') 
,('002' ,'Area1' ,2, '21Nov16') 
,('002' ,'Area1' ,1, '21Oct16') 


; WITH LL AS 
(
    -- most recent date per client (Dated is stored as text, so cast it first)
    SELECT ClientID, MAX(CAST(Dated AS date)) AS maxdate
    FROM @Table 
    GROUP BY ClientID
) 
, Area AS 
(
    -- location recorded on that most recent date
    SELECT x.Location, x.ClientID, x.Dated
    FROM @Table x
    INNER JOIN LL b ON x.ClientID = b.ClientID AND CAST(x.Dated AS date) = b.maxdate
) 
SELECT a.*
, LatestEpisode = MAX(Episode) OVER(PARTITION BY a.ClientID)
, LatestLocation = b.Location
FROM @Table a 
INNER JOIN Area b ON a.ClientID = b.ClientID

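Note: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.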

 