
SQL: Union or Self Join

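I have a simple table: user(id, date, task)

The task field contains either "download" or "upload".

I want to find out how many users performed each action per day.

Output: date, number of users who downloaded, number of users who uploaded

I first ran into trouble using a subquery inside the aggregate count function in the select, so I figured I should use a self join here to split up the data in the "task" column.

I thought I could build a result set for each case, then combine them and count, but I am having trouble getting this to work:

SELECT id, date, task AS task_download FROM user WHERE task = 'download'

SELECT id, date, task AS task_upload FROM user WHERE task = 'upload'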

select  `date`, 
COUNT( distinct CASE WHEN task = 'download' then id end ) 'download', 
COUNT( distinct CASE WHEN task = 'upload' then id end ) 'upload'
from user
group by  `date`

I would say neither. A query like this will do the job:

select `date`, 
    count(distinct case when task = 'download' then id else null end) as downloads, 
    count(distinct case when task = 'upload' then id else null end) as uploads
from user
where  task in ('download', 'upload')
group by `date`

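This assumes date is a column that holds only the date part rather than a full timestamp, and that id is the user ID. You can use the distinct keyword inside an aggregate function, which is what I did here: the case expression yields the id only for rows with the matching task (and null otherwise), and count(distinct ...) ignores nulls, so each user is counted at most once per day and task.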
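To see the per-day distinct counting in action, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the sample rows and the helper name users_per_day are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (id, date, task).
ROWS = [
    (1, "2024-01-01", "download"),
    (1, "2024-01-01", "download"),  # same user, same day: counted once
    (2, "2024-01-01", "download"),
    (2, "2024-01-01", "upload"),
    (3, "2024-01-02", "upload"),
]

# Same shape as the query above; "else null" is implied when omitted.
QUERY = """
select `date`,
    count(distinct case when task = 'download' then id end) as downloads,
    count(distinct case when task = 'upload' then id end) as uploads
from user
where task in ('download', 'upload')
group by `date`
order by `date`
"""


def users_per_day(rows):
    """Load rows into a throwaway table and return (date, downloads, uploads)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table user (id integer, date text, task text)")
    conn.executemany("insert into user values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in users_per_day(ROWS):
    print(row)
```

User 1 downloads twice on the first day but contributes only one to that day's download count, which is the effect of distinct inside the aggregate.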

To make this query run fast, I suggest a composite index on (task, date).
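As a rough sketch of that index, the snippet below creates it on an in-memory SQLite table and reads back its definition. The index name idx_user_task_date is hypothetical, and SQLite stands in for MySQL; whether the optimizer actually uses the index depends on your engine and data, so check the query plan on your own system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table user (id integer, date text, task text)")

# Hypothetical index name; column order follows the suggestion (task, date).
conn.execute("create index idx_user_task_date on user (task, date)")

# Read back which indexes exist on the table and which columns they cover.
index_names = [row[1] for row in conn.execute("pragma index_list('user')")]
index_columns = [
    row[2] for row in conn.execute("pragma index_info('idx_user_task_date')")
]

print(index_names, index_columns)
```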

However, if date holds a full timestamp (that is, it includes a time part), you need to group differently, by the date part only:

select date(`date`) as day,
    count(distinct case when task = 'download' then id else null end) as downloads, 
    count(distinct case when task = 'upload' then id else null end) as uploads
from user
where  task in ('download', 'upload')
group by date(`date`)
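The timestamp case can be tried out with the sketch below, which again uses an in-memory SQLite database as a stand-in for MySQL (both provide a date() function that strips the time part). The sample rows and the helper name users_per_calendar_day are assumptions for illustration. The select list uses date(`date`) so that the selected value matches the grouping expression.

```python
import sqlite3

# Hypothetical sample rows with full timestamps: (id, date, task).
ROWS = [
    (1, "2024-01-01 09:15:00", "download"),
    (1, "2024-01-01 17:40:00", "download"),  # same user later that day
    (2, "2024-01-01 23:59:59", "upload"),
    (2, "2024-01-02 00:00:01", "upload"),    # two seconds later, next day
]

QUERY = """
select date(`date`) as day,
    count(distinct case when task = 'download' then id end) as downloads,
    count(distinct case when task = 'upload' then id end) as uploads
from user
where task in ('download', 'upload')
group by date(`date`)
order by day
"""


def users_per_calendar_day(rows):
    """Return (day, downloads, uploads), grouping timestamps by calendar day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table user (id integer, date text, task text)")
    conn.executemany("insert into user values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in users_per_calendar_day(ROWS):
    print(row)
```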

You can do this with subqueries, for example:

SELECT a.`date` AS `day`,
(SELECT COUNT(*) FROM activity u WHERE u.`date` = a.`date` AND u.activity = 'upload') AS upload_count,
(SELECT COUNT(*) FROM activity d WHERE d.`date` = a.`date` AND d.activity = 'download') AS download_count
FROM activity a
GROUP BY a.`date`;

Here is the SQL Fiddle.

First count the distinct users per date and task, then sum those counts per date, with one column for each task.

select date,
       sum(case when task = 'upload' then num_users else 0 end) as "upload",
       sum(case when task = 'download' then num_users else 0 end) as "download"
from  (       
       select   date, task, count(distinct id) num_users
       from     usert
       group by date, task
      ) x
group by date
;

Check it here: http://rextester.com/ZACFB64945
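For anyone who wants to run the two-step version locally, here is a minimal sketch using Python's sqlite3 with an in-memory database. The table name usert is taken from the query above; the sample rows and the helper name two_step_counts are made up, and SQLite is only a stand-in for whatever engine you use.

```python
import sqlite3

# Hypothetical sample rows: (id, date, task).
ROWS = [
    (1, "2024-01-01", "download"),
    (1, "2024-01-01", "download"),  # repeat by the same user
    (2, "2024-01-01", "download"),
    (2, "2024-01-01", "upload"),
    (3, "2024-01-02", "upload"),
]

QUERY = """
select date,
       sum(case when task = 'upload' then num_users else 0 end) as upload,
       sum(case when task = 'download' then num_users else 0 end) as download
from (
       -- step 1: distinct users per (date, task)
       select date, task, count(distinct id) as num_users
       from usert
       group by date, task
     ) x
group by date  -- step 2: pivot the per-task counts into columns
order by date
"""


def two_step_counts(rows):
    """Return (date, upload, download) using the derived-table approach."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table usert (id integer, date text, task text)")
    conn.executemany("insert into usert values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in two_step_counts(ROWS):
    print(row)
```

Because the inner query already yields one row per date and task, the outer sum only pivots rows into columns and never adds two counts together.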

If you want distinct users, I recommend count(distinct):

SELECT date, 
       COUNT(DISTINCT CASE WHEN task = 'upload' THEN userid END) as uploads,
       COUNT(DISTINCT CASE WHEN task = 'download' THEN userid END) as downloads
FROM user
GROUP BY date
ORDER BY date;

If you want to count the actions themselves instead, you can do it like this:

SELECT date, 
       SUM( (task = 'upload')::int ) as uploads,
       SUM( (task = 'download')::int) as downloads
FROM user
GROUP BY date
ORDER BY date;

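This uses a convenient Postgres shorthand for counting boolean expressions: casting a boolean to int gives 1 for true and 0 for false, so the sum is the number of matching rows.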
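The difference between counting users and counting actions is easy to miss, so here is a small sketch that puts both in one query. It runs on SQLite through Python's sqlite3, which is an assumption of this example and not Postgres: SQLite has no ::int cast, but a comparison there already evaluates to 0 or 1, so sum(task = 'upload') plays the same role. The sample rows and the helper name users_vs_actions are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (id, date, task).
ROWS = [
    (1, "2024-01-01", "upload"),
    (1, "2024-01-01", "upload"),
    (1, "2024-01-01", "upload"),    # user 1 uploads three times
    (2, "2024-01-01", "upload"),
    (2, "2024-01-01", "download"),
]

QUERY = """
select date,
       count(distinct case when task = 'upload' then id end) as uploading_users,
       sum(task = 'upload') as uploads,     -- SQLite: comparison yields 0 or 1
       sum(task = 'download') as downloads
from user
group by date
order by date
"""


def users_vs_actions(rows):
    """Return (date, uploading_users, uploads, downloads)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table user (id integer, date text, task text)")
    conn.executemany("insert into user values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in users_vs_actions(ROWS):
    print(row)
```

With these rows there are four uploads but only two uploading users, so pick the expression that matches the number you actually need.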

I would use conditional aggregation.

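To get a count of the users who performed at least one upload on a given date (incrementing the count by only one per user for that date, even if that user performed several uploads on the same date), we can use a COUNT(DISTINCT user) expression.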

To get a count of the total number of uploads, we can use COUNT or SUM.

SELECT DATE(t.date) AS `date`
     , COUNT(DISTINCT IF(t.task='upload'  ,t.user,NULL)) AS cnt_users_who_uploaded
     , COUNT(DISTINCT IF(t.task='download',t.user,NULL)) AS cnt_users_who_downloaded
     , SUM(IF(t.task='upload'  ,1,0))                    AS cnt_uploads
     , SUM(IF(t.task='download',1,0))                    AS cnt_downloads
  FROM user t
 GROUP BY DATE(t.date)
 ORDER BY DATE(t.date)

Note: this will not return zero counts for dates that have no rows in the table; such dates are simply missing from the result.
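If zero rows for missing dates matter, one common workaround is to generate the list of days and left join the data onto it. The sketch below is not part of the answer above; it assumes SQLite (via Python's sqlite3) and its recursive CTE support, uses id as the user column, and the sample rows, the parameter names first_day and last_day, and the helper name counts_with_gaps are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (id, date, task). Nothing happens on 2024-01-02.
ROWS = [
    (1, "2024-01-01", "upload"),
    (2, "2024-01-01", "download"),
    (1, "2024-01-03", "download"),
]

QUERY = """
with recursive calendar(day) as (
    select :first_day
    union all
    select date(day, '+1 day') from calendar where day < :last_day
)
select c.day,
       count(distinct case when t.task = 'upload' then t.id end)
           as cnt_users_who_uploaded,
       count(distinct case when t.task = 'download' then t.id end)
           as cnt_users_who_downloaded
from calendar c
left join user t on date(t.date) = c.day  -- keeps days without any rows
group by c.day
order by c.day
"""


def counts_with_gaps(rows, first_day, last_day):
    """Return one row per calendar day in the range, zeros included."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table user (id integer, date text, task text)")
    conn.executemany("insert into user values (?, ?, ?)", rows)
    params = {"first_day": first_day, "last_day": last_day}
    result = conn.execute(QUERY, params).fetchall()
    conn.close()
    return result


for row in counts_with_gaps(ROWS, "2024-01-01", "2024-01-03"):
    print(row)
```

In MySQL 8.0 and later a recursive CTE can serve the same purpose; on older versions a prebuilt calendar table is the usual substitute.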

