SQL: Union or Self Join
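I have a simple table: user(id, date, task)
The task field contains either "download" or "upload".
I want to find the number of users who perform each action per day.
Output: date, number of users who downloaded, number of users who uploaded
I first ran into trouble using a subquery inside the aggregate count function in the select, so I thought I should use a self join here to split out the data in the "task" column.
I thought I could create a table for each case and then combine them and count, but I am having trouble finishing this:
SELECT id, date, task AS task_download FROM user WHERE task = 'download'
SELECT id, date, task AS task_upload FROM user WHERE task = 'upload'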
select `date`,
COUNT( distinct CASE WHEN task = 'download' then id end ) 'download',
COUNT( distinct CASE WHEN task = 'upload' then id end ) 'upload'
from user
group by `date`
I would say neither. A query like this does the job:
select `date`,
count(distinct case when task = 'download' then id else null end) as downloads,
count(distinct case when task = 'upload' then id else null end) as uploads
from user
where task in ('download', 'upload')
group by `date`
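This assumes that date is a column holding only the date part rather than a full timestamp, and that id is the user ID. You can use the distinct keyword inside an aggregate function, which is what I did here.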
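To see the conditional aggregation in action, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The sample rows are made up for illustration, and SQLite is only a stand-in for whatever engine you actually use.

```python
import sqlite3

# Hypothetical sample data: (id, date, task).
# User 1 downloads twice on the same day and must be counted once.
rows = [
    (1, "2020-01-01", "download"),
    (1, "2020-01-01", "download"),
    (2, "2020-01-01", "download"),
    (2, "2020-01-01", "upload"),
    (3, "2020-01-02", "upload"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, date TEXT, task TEXT)')
conn.executemany('INSERT INTO "user" VALUES (?, ?, ?)', rows)

# CASE yields the id only for the matching task and NULL otherwise;
# COUNT(DISTINCT ...) ignores NULLs, so each user counts once per day.
query = """
    SELECT date,
           COUNT(DISTINCT CASE WHEN task = 'download' THEN id END) AS downloads,
           COUNT(DISTINCT CASE WHEN task = 'upload'   THEN id END) AS uploads
    FROM "user"
    WHERE task IN ('download', 'upload')
    GROUP BY date
    ORDER BY date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On 2020-01-01 the two downloads by user 1 collapse into a single count, which is the whole point of distinct here.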
To make this query run fast, I suggest an index on (task, date).
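A composite index like that could be created as sketched below. The index name idx_user_task_date is hypothetical, and the example uses SQLite from Python only so that it is runnable; whether the optimizer actually uses the index depends on your engine and data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, date TEXT, task TEXT)')

# Hypothetical index name; the column order (task, date) follows the suggestion above.
conn.execute('CREATE INDEX idx_user_task_date ON "user" (task, date)')

# Inspect what was created: index names on the table, then the indexed columns in order.
index_names = [r[1] for r in conn.execute("PRAGMA index_list('user')")]
index_columns = [r[2] for r in conn.execute("PRAGMA index_info('idx_user_task_date')")]
print(index_names, index_columns)
```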
However, if date contains a full timestamp (i.e. it includes the time part), you need to group differently:
select date(`date`) as `day`,
count(distinct case when task = 'download' then id else null end) as downloads,
count(distinct case when task = 'upload' then id else null end) as uploads
from user
where task in ('download', 'upload')
group by date(`date`)
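A small runnable sketch of the timestamp case, again using SQLite from Python as a stand-in. SQLite's date() function plays the role of MySQL's DATE() here, and the sample timestamps are made up.

```python
import sqlite3

# Hypothetical sample data with full timestamps in the date column.
rows = [
    (1, "2020-01-01 08:00:00", "download"),
    (1, "2020-01-01 17:30:00", "download"),
    (2, "2020-01-01 09:15:00", "upload"),
    (2, "2020-01-02 10:00:00", "download"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, date TEXT, task TEXT)')
conn.executemany('INSERT INTO "user" VALUES (?, ?, ?)', rows)

# Truncate the timestamp to its date part in both the SELECT list and the
# GROUP BY, so that all rows from the same calendar day fall into one group.
query = """
    SELECT date("date") AS day,
           COUNT(DISTINCT CASE WHEN task = 'download' THEN id END) AS downloads,
           COUNT(DISTINCT CASE WHEN task = 'upload'   THEN id END) AS uploads
    FROM "user"
    WHERE task IN ('download', 'upload')
    GROUP BY date("date")
    ORDER BY day
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Grouping by the raw timestamp instead would produce one row per distinct second, which is why the truncation matters.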
You can do this with subqueries, for example:
-- Note: COUNT(*) counts rows (actions), not distinct users.
SELECT a.`date` AS `day`,
(SELECT COUNT(*) FROM activity u WHERE u.`date` = a.`date` AND u.activity = 'upload') AS upload_count,
(SELECT COUNT(*) FROM activity d WHERE d.`date` = a.`date` AND d.activity = 'download') AS download_count
FROM activity a
GROUP BY a.`date`;
Here is the SQL Fiddle.
First count the distinct users by date and task, then sum the users for each task per date.
select date,
sum(case when task = 'upload' then num_users else 0 end) as "upload",
sum(case when task = 'download' then num_users else 0 end) as "download"
from (
select date, task, count(distinct id) num_users
from usert
group by date, task
) x
group by date
;
Check it here: http://rextester.com/ZACFB64945
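For readers who cannot open the link, here is a rough local equivalent of the two-step approach, run against SQLite from Python. The table name usert is taken from the query above; the sample rows are made up.

```python
import sqlite3

# Hypothetical sample data: (id, date, task).
rows = [
    (1, "2020-01-01", "download"),
    (1, "2020-01-01", "download"),
    (2, "2020-01-01", "download"),
    (2, "2020-01-01", "upload"),
    (3, "2020-01-02", "upload"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usert (id INTEGER, date TEXT, task TEXT)")
conn.executemany("INSERT INTO usert VALUES (?, ?, ?)", rows)

# Step 1 (inner query): distinct users per (date, task).
# Step 2 (outer query): pivot the per-task counts into one row per date.
query = """
    SELECT date,
           SUM(CASE WHEN task = 'upload'   THEN num_users ELSE 0 END) AS "upload",
           SUM(CASE WHEN task = 'download' THEN num_users ELSE 0 END) AS "download"
    FROM (
        SELECT date, task, COUNT(DISTINCT id) AS num_users
        FROM usert
        GROUP BY date, task
    ) x
    GROUP BY date
    ORDER BY date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```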
If you want distinct users, then I suggest count(distinct):
SELECT date,
COUNT(DISTINCT CASE WHEN task = 'upload' THEN userid END) as uploads,
COUNT(DISTINCT CASE WHEN task = 'download' THEN userid END) as downloads
FROM user
GROUP BY date
ORDER BY date;
If you want to count the actions themselves, you can do it like this:
SELECT date,
SUM( (task = 'upload')::int ) as uploads,
SUM( (task = 'download')::int) as downloads
FROM user
GROUP BY date
ORDER BY date;
This uses a convenient Postgres shorthand for counting boolean expressions.
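The difference between the two counts is easy to miss, so here is a small sketch that puts them side by side. It runs on SQLite from Python and therefore uses the portable SUM(CASE ...) form rather than the Postgres ::int cast; the sample rows are made up.

```python
import sqlite3

# Hypothetical sample data: user 1 uploads three times on the same day.
rows = [
    (1, "2020-01-01", "upload"),
    (1, "2020-01-01", "upload"),
    (1, "2020-01-01", "upload"),
    (2, "2020-01-01", "upload"),
    (2, "2020-01-01", "download"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, date TEXT, task TEXT)')
conn.executemany('INSERT INTO "user" VALUES (?, ?, ?)', rows)

# upload_users counts each user once; upload_actions counts every upload row.
query = """
    SELECT date,
           COUNT(DISTINCT CASE WHEN task = 'upload' THEN id END) AS upload_users,
           SUM(CASE WHEN task = 'upload' THEN 1 ELSE 0 END)      AS upload_actions
    FROM "user"
    GROUP BY date
    ORDER BY date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Two users account for four uploads, so the two columns differ as soon as anyone repeats an action on the same day.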
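I would use conditional aggregation.
To get a count of the users who performed at least one upload on a given date (where a user who performed several uploads on the same date still increments the count for that date by only one), we can use a COUNT(DISTINCT user) expression.
To count the total number of uploads, we can use COUNT or SUM.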
SELECT DATE(t.date) AS `date`
, COUNT(DISTINCT IF(t.task='upload' ,t.user,NULL)) AS cnt_users_who_uploaded
, COUNT(DISTINCT IF(t.task='download',t.user,NULL)) AS cnt_users_who_downloaded
, SUM(IF(t.task='upload' ,1,0)) AS cnt_uploads
, SUM(IF(t.task='download',1,0)) AS cnt_downloads
FROM user t
GROUP BY DATE(t.date)
ORDER BY DATE(t.date)
Note: this will not return zero counts for a date that has no rows in the table.
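If rows with zero counts are needed for missing dates, one common workaround is to left join from a calendar of dates. The sketch below is only one way to do it: it builds a hypothetical calendar with a recursive CTE in SQLite (run from Python), with a hard-coded date range and made-up sample rows. In MySQL you would need version 8.0 or later for recursive CTEs, or a real calendar table.

```python
import sqlite3

# Hypothetical sample data: nothing happens on 2020-01-02.
rows = [
    (1, "2020-01-01", "upload"),
    (2, "2020-01-03", "download"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, date TEXT, task TEXT)')
conn.executemany('INSERT INTO "user" VALUES (?, ?, ?)', rows)

# calendar(day) generates every date in the (assumed) reporting range.
# The LEFT JOIN keeps days without activity; their counts come out as 0
# because COUNT ignores the NULLs produced by the unmatched join.
query = """
    WITH RECURSIVE calendar(day) AS (
        SELECT '2020-01-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM calendar WHERE day < '2020-01-03'
    )
    SELECT c.day,
           COUNT(DISTINCT CASE WHEN u.task = 'upload'   THEN u.id END) AS uploads,
           COUNT(DISTINCT CASE WHEN u.task = 'download' THEN u.id END) AS downloads
    FROM calendar c
    LEFT JOIN "user" u ON u.date = c.day
    GROUP BY c.day
    ORDER BY c.day
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```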