Finding the most frequent value in SQL Server 2012
I want to find the product that each customer buys most frequently. My dataset looks like this:
CustomerID ProdID FavouriteProduct
1 A ?
1 A ?
1 A ?
1 B ?
1 A ?
1 A ?
1 A ?
1 B ?
2 A ?
2 AN ?
2 G ?
2 C ?
2 C ?
2 F ?
2 D ?
2 C ?
There are too many products, so I can't put them in a pivot table.
The answer should look like this:
CustomerID ProdID FavouriteProduct
1 A A
1 A A
1 A A
1 B A
1 A A
1 A A
1 A A
1 B A
2 A C
2 AN C
2 G C
2 C C
2 C C
2 F C
2 D C
2 C C
The query might look something like this:
Update table
set FavouriteProduct = (Select
CustomerID, Product, Max(Count(Product))
From Table
group by CustomerID, Product) FP
Another way to get the most frequent product is to use row_number():
select customerid, productid,
max(case when seqnum = 1 then productid end) over (partition by customerid) as favoriteproductid
from (select customerid, productid, count(*) as cnt,
row_number() over (partition by customerid order by count(*) desc) as seqnum
from customer c
group by customerid, productid
) cp;
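To return the rows exactly as described in the question, you can use a table expression (a CTE in this example) that first returns a popularity count, where a larger number means the product is more popular with that customer.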
WITH RankTable AS (
SELECT
CustomerID, ProductID, COUNT(*) AS Popularity
FROM TableA
GROUP BY CustomerID, ProductID
)
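The full result table can then be returned by first doing an inner join between the original table (TableA) and the table expression (RankTable), and then using a window function to produce the values in the FavoriteProduct column. Note that the SELECT must follow the CTE directly, as part of the same statement.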
SELECT
P.CustomerID
, P.ProductID
, FIRST_VALUE(P.ProductID) OVER(
PARTITION BY R.CustomerID
ORDER BY R.Popularity DESC, R.ProductID) AS FavoriteProduct
FROM TableA AS P
INNER JOIN RankTable AS R
ON P.CustomerID = R.CustomerID
AND P.ProductID= R.ProductID;
Thanks to Nick, I found a way to get the most frequent value. Here is how it works:
-- [table] stands for your own table name; A and B are aliases for it
Select CustomerID, ProductID, Count(*) as Number
from [table] A
group by CustomerID, ProductID
having Count(*) >= (Select Max(Number)
                    from (Select CustomerID, ProductID, Count(*) as Number
                          from [table] B
                          where B.CustomerID = A.CustomerID
                          group by CustomerID, ProductID) C)
In case that SQL doesn't run fast enough, and your customers are also stored in a separate, smaller table, this may work better:
select C.CustomerId, R.ProductID
from Customer C
outer apply (
Select top 1 ProductID,Count(*) as Number
from [table] A -- [table] stands for your own table name
where A.CustomerId = C.CustomerId
group by ProductId
order by Number desc
) R
This may be a bit faster. It is based on the example at the end of this page: http://www.sql-server-performance.com/2006/find-frequent-values/
SELECT CustomerID, ProdID, Cnt
FROM
(
SELECT CustomerID, ProdID, COUNT(*) as Cnt,
RANK() OVER (
PARTITION BY CustomerID
ORDER BY COUNT(*) DESC
) AS Rnk
FROM YourTransactionTable
GROUP BY CustomerID, ProdID
) x
WHERE Rnk = 1
This one uses the RANK() function. With it you don't have to join back to the same table, which means considerably less work for the server.
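One caveat with RANK(): when two products tie for the highest count, both get rank 1, so a customer can come back with more than one row. A minimal sketch of one way around that, assuming you are happy to break ties alphabetically by ProdID, swaps RANK() for ROW_NUMBER() with a secondary sort key. The temp table #Purchases and its sample rows are made up for illustration.

```sql
-- Hypothetical temp table in the shape of the question's data
CREATE TABLE #Purchases (CustomerID int, ProdID varchar(10));

INSERT INTO #Purchases (CustomerID, ProdID) VALUES
    (1, 'A'), (1, 'A'), (1, 'B'),
    (2, 'C'), (2, 'C'), (2, 'D'), (2, 'D');  -- customer 2: C and D tie

SELECT CustomerID, ProdID, Cnt
FROM
(
    SELECT CustomerID, ProdID, COUNT(*) AS Cnt,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID
               ORDER BY COUNT(*) DESC, ProdID  -- assumption: ties go to the lowest ProdID
           ) AS Rn
    FROM #Purchases
    GROUP BY CustomerID, ProdID
) x
WHERE Rn = 1;
-- Returns exactly one row per customer: (1, 'A', 2) and (2, 'C', 2)

DROP TABLE #Purchases;
```

With RANK() in place of ROW_NUMBER(), customer 2 would return both C and D. Which behaviour you want depends on whether ties should be reported or resolved.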
Now, to update your existing data, I like to wrap my dataset in a WITH, which makes debugging easier and the final update simpler:
;WITH SRC AS
(
    -- One row per customer (more if there is a tie): the most frequent product
    SELECT CustomerID, ProdID, Cnt
    FROM
    (
        SELECT CustomerID, ProdID, COUNT(*) as Cnt,
               RANK() OVER (PARTITION BY CustomerID
                            ORDER BY COUNT(*) DESC) AS Rnk
        FROM TransactionTable
        GROUP BY CustomerID, ProdID
    ) x
    WHERE Rnk = 1
)
UPDATE FavouriteTable
SET Favourite = SRC.ProdID
FROM SRC
WHERE SRC.CustomerID = FavouriteTable.CustomerID
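If, as in the question, the FavouriteProduct column lives in the same table as the purchases, the same idea can be written as a joined update. This is a sketch only: #Purchases and its rows are hypothetical, and ties are broken alphabetically by ProdID, which is an assumption rather than anything the question asks for.

```sql
-- Hypothetical temp table matching the question's three columns
CREATE TABLE #Purchases (
    CustomerID int,
    ProdID varchar(10),
    FavouriteProduct varchar(10) NULL
);

INSERT INTO #Purchases (CustomerID, ProdID) VALUES
    (1, 'A'), (1, 'A'), (1, 'B'),
    (2, 'C'), (2, 'C'), (2, 'D');

-- Rank each customer's products by purchase count, then write the top one
-- back onto every row belonging to that customer
WITH Ranked AS
(
    SELECT CustomerID, ProdID,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID
               ORDER BY COUNT(*) DESC, ProdID  -- assumption: alphabetical tie-break
           ) AS Rn
    FROM #Purchases
    GROUP BY CustomerID, ProdID
)
UPDATE p
SET FavouriteProduct = r.ProdID
FROM #Purchases AS p
INNER JOIN Ranked AS r
    ON r.CustomerID = p.CustomerID
WHERE r.Rn = 1;

-- Every customer 1 row now has FavouriteProduct = 'A', every customer 2 row 'C'
SELECT CustomerID, ProdID, FavouriteProduct
FROM #Purchases
ORDER BY CustomerID, ProdID;

DROP TABLE #Purchases;
```

Using ROW_NUMBER() here rather than RANK() guarantees a single source row per customer, so the update cannot pick between tied products arbitrarily.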