
SQL return consecutive records

A simple table:

ForumPost
--------------
ID (int PK)
UserID (int FK)
Date (datetime)

What I want to return is the number of times a particular user has made at least one post a day for n consecutive days.

Example:

User 15844 has posted at least 1 post a day for 30 consecutive days 10 times

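I've also tagged this question with linq/lambda, as a solution there would be great too. I know I can solve this by iterating over all of a user's records, but that is slow.

You can use ROW_NUMBER() to find consecutive entries. Imagine the following set of dates, each with its row_number (starting from 0):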

Date        RowNumber
20130401    0
20130402    1
20130403    2
20130404    3
20130406    4
20130407    5

For consecutive entries, subtracting the row_number from the value gives the same result. For example:

Date        RowNumber   date - row_number
20130401    0           20130401
20130402    1           20130401
20130403    2           20130401
20130404    3           20130401
20130406    4           20130402
20130407    5           20130402

You can then group by date - row_number to get the sets of consecutive dates (i.e. the first 4 records and the last 2 records).
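As a quick illustration outside SQL, the same grouping trick can be sketched in Python. This is only a minimal sketch using the six sample dates from the table above; the variable names are made up for the example.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample dates from the table above (2013-04-05 is missing).
dates = [date(2013, 4, d) for d in (1, 2, 3, 4, 6, 7)]

# Subtract each date's 0-based row number; consecutive dates share the same key.
keyed = [(d - timedelta(days=i), d) for i, d in enumerate(sorted(dates))]

# Group on the key to recover each run of consecutive dates.
runs = [[d for _, d in grp] for _, grp in groupby(keyed, key=lambda kd: kd[0])]

for run in runs:
    print(run[0], "to", run[-1], "-", len(run), "days")
```

The first run covers 2013-04-01 to 2013-04-04 (4 days) and the second covers 2013-04-06 to 2013-04-07 (2 days), matching the two groups described above.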

To apply this to your example, use:

WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost, 
            UserID, 
            Days = COUNT(*), 
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, ConsecutiveDates = MAX(Days)
FROM    Posts2
GROUP BY UserID;

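Example on SQL Fiddle (simple, showing only the maximum consecutive days per user)

Further example showing how to get all consecutive periods

EDIT

I don't think the above quite answers the question. The following gives the number of times a user has posted on n or more consecutive days: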

WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost, 
            UserID, 
            Days = COUNT(*), 
            FirstDate = MIN(Date), 
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, [Times Over N Days] = COUNT(*)
FROM    Posts2
WHERE   Days >= 30
GROUP BY UserID;

Example on SQL Fiddle
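For readers who want to check the logic without a database, here is a rough Python equivalent of the query above. It assumes the posts are already available as (user_id, date) pairs; the function name `times_over_n_days` is hypothetical and not part of the original answer.

```python
from collections import Counter
from datetime import date, timedelta

def times_over_n_days(posts, n):
    """Count, per user, the runs of consecutive posting days lasting at least n days."""
    # Distinct (user, day) pairs, like the SELECT DISTINCT subquery.
    days_by_user = {}
    for user_id, day in set(posts):
        days_by_user.setdefault(user_id, []).append(day)

    result = {}
    for user_id, days in days_by_user.items():
        # Same idea as FirstPost: date minus row number is constant within a run.
        run_lengths = Counter(
            d - timedelta(days=i) for i, d in enumerate(sorted(days))
        )
        count = sum(1 for length in run_lengths.values() if length >= n)
        if count:  # like the WHERE clause, users with no qualifying run are omitted
            result[user_id] = count
    return result
```

With n = 30 this mirrors `WHERE Days >= 30`: each run of consecutive days is counted once, however long it is.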

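I think your particular application makes this pretty simple. If you have 'n' distinct dates in an 'n'-day interval, those 'n' distinct dates must be consecutive.

Scroll to the bottom for a general solution that requires only common table expressions and changing to PostgreSQL. (Kidding. I implemented it in PostgreSQL because I'm short of time.)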

create table ForumPost (
  ID integer primary key,
  UserID integer not null,
  post_date date not null
);

insert into forumpost values
(1, 1, '2013-01-15'),
(2, 1, '2013-01-16'),
(3, 1, '2013-01-17'),
(4, 1, '2013-01-18'),
(5, 1, '2013-01-19'),
(6, 1, '2013-01-20'),
(7, 1, '2013-01-21'),

(11, 2, '2013-01-15'),
(12, 2, '2013-01-16'),
(13, 2, '2013-01-17'),
(16, 2, '2013-01-17'),
(14, 2, '2013-01-18'),
(15, 2, '2013-01-19'),

(21, 3, '2013-01-17'),
(22, 3, '2013-01-17'),
(23, 3, '2013-01-17'),
(24, 3, '2013-01-17'),
(25, 3, '2013-01-17'),
(26, 3, '2013-01-17'),
(27, 3, '2013-01-17');

Now, let's look at the output of this query. For brevity, I'm looking at 5-day intervals, not 30-day intervals.

select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid;

USERID  DISTINCT_DATES  
1       5
2       5
3       1

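For users that fit the criteria, the number of distinct dates in that 5-day interval has to be 5, right? So we just need to add that logic to a HAVING clause.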

select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid
having count(distinct post_date) = 5;

USERID  DISTINCT_DATES  
1       5
2       5
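The same check can be sketched in Python as a sanity test of the reasoning. The rows below mirror the (UserID, post_date) pairs from the sample INSERT above; the helper name `users_posting_every_day` is an assumption made for this example.

```python
from datetime import date

# Rows mirror the (UserID, post_date) pairs in the sample INSERT above.
rows = (
    [(1, date(2013, 1, d)) for d in range(15, 22)]
    + [(2, date(2013, 1, d)) for d in (15, 16, 17, 17, 18, 19)]
    + [(3, date(2013, 1, 17))] * 7
)

def users_posting_every_day(rows, start, end):
    """Return users whose distinct post dates cover every day from start to end."""
    window_days = (end - start).days + 1  # BETWEEN is inclusive on both ends
    distinct = {}
    for user_id, day in rows:
        if start <= day <= end:
            distinct.setdefault(user_id, set()).add(day)
    # n distinct dates inside an n-day window can only be n consecutive dates.
    return sorted(u for u, days in distinct.items() if len(days) == window_days)

print(users_posting_every_day(rows, date(2013, 1, 15), date(2013, 1, 19)))  # [1, 2]
```

User 3 is excluded because all seven of that user's posts fall on the same day.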

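A more general solution

It doesn't really make sense to say that, if you post every day from 2013-01-01 to 2013-01-31, you've posted 30 consecutive days 2 times. Instead, I'd expect the clock to start over on 2013-01-31. My apologies for implementing this in PostgreSQL; I'll try to implement it in T-SQL later.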

with first_posts as (
  select userid, min(post_date) first_post_date
  from forumpost
  group by userid
), 
period_intervals as (
  select userid, first_post_date period_start, 
         (first_post_date + interval '4' day)::date period_end
  from first_posts
), user_specific_intervals as (
  select 
    userid, 
    (period_start + (n || ' days')::interval)::date as period_start, 
    (period_end + (n || ' days')::interval)::date as period_end 
  from period_intervals, generate_series(0, 30, 5) n
)
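-- Qualify every column: an unqualified userid inside the subquery would bind
-- to forumpost.userid, making the comparison always true.
select usi.userid, usi.period_start, usi.period_end, 
       (select count(distinct fp.post_date) 
        from forumpost fp
        where fp.post_date between usi.period_start and usi.period_end
          and fp.userid = usi.userid) distinct_dates
from user_specific_intervals usi
order by usi.userid, usi.period_start;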
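To make the windowing idea concrete, here is a small Python sketch of the same approach for a single user: back-to-back n-day windows starting at the first post, with the distinct posting days counted in each. The function name `window_counts` and its parameters are hypothetical.

```python
from datetime import date, timedelta

def window_counts(days, n, windows):
    """Count distinct posting days in back-to-back n-day windows from the first post."""
    distinct = set(days)
    first = min(distinct)
    out = []
    for k in range(windows):
        start = first + timedelta(days=k * n)
        end = start + timedelta(days=n - 1)  # inclusive end, like BETWEEN
        count = sum(1 for d in distinct if start <= d <= end)
        out.append((start, end, count))
    return out

# User 1 from the sample data posted every day from 2013-01-15 to 2013-01-21.
user1 = [date(2013, 1, d) for d in range(15, 22)]
for start, end, count in window_counts(user1, n=5, windows=2):
    print(start, end, count)
```

A window counts as a full streak when its count equals n. For user 1 the first window (2013-01-15 to 2013-01-19) has 5 distinct days and the second (2013-01-20 to 2013-01-24) has 2. Like the query, this sketch only checks windows aligned to the first post date.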
