
Calculating average time interval length

I have prepared a simple SQL Fiddle demonstrating my problem.

In PostgreSQL 10.3 I store user information, two-player games, and the moves in the following 3 tables:

CREATE TABLE players (
    uid SERIAL PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE games (
    gid SERIAL PRIMARY KEY,
    player1 integer NOT NULL REFERENCES players ON DELETE CASCADE,
    player2 integer NOT NULL REFERENCES players ON DELETE CASCADE
);

CREATE TABLE moves (
    mid BIGSERIAL PRIMARY KEY,
    uid integer NOT NULL REFERENCES players ON DELETE CASCADE,
    gid integer NOT NULL REFERENCES games ON DELETE CASCADE,
    played timestamptz NOT NULL
);

Let's assume that 2 players, Alice and Bob, have played 3 games with each other:

INSERT INTO players (name) VALUES ('Alice'), ('Bob');
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);

And let's assume that the 1st game was played quickly, with moves being played every minute.

But then they chilled :-) and played 2 slow games, with moves every 10 minutes:

INSERT INTO moves (uid, gid, played) VALUES
(1, 1, now() + interval '1 min'),
(2, 1, now() + interval '2 min'),
(1, 1, now() + interval '3 min'),
(2, 1, now() + interval '4 min'),
(1, 1, now() + interval '5 min'),
(2, 1, now() + interval '6 min'),

(1, 2, now() + interval '10 min'),
(2, 2, now() + interval '20 min'),
(1, 2, now() + interval '30 min'),
(2, 2, now() + interval '40 min'),
(1, 2, now() + interval '50 min'),
(2, 2, now() + interval '60 min'),

(1, 3, now() + interval '110 min'),
(2, 3, now() + interval '120 min'),
(1, 3, now() + interval '130 min'),
(2, 3, now() + interval '140 min'),
(1, 3, now() + interval '150 min'),
(2, 3, now() + interval '160 min');

On a web page with gaming statistics, I would like to display the average time passing between moves for each player.

So I suppose I have to use the LAG window function of PostgreSQL.

Since several games can be played simultaneously, I am trying to PARTITION BY gid (i.e. by the "game id").
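To make the intent of LAG with PARTITION BY gid concrete, here is a minimal pure-Python sketch of what such a window is expected to produce. It is only an illustration: the helper name with_lag and the shortened sample rows (with played given in whole minutes) are assumptions, not part of the schema above.

```python
from collections import defaultdict

# Hypothetical, shortened sample: (uid, gid, played_minute)
moves = [(1, 1, 1), (2, 1, 2), (1, 1, 3),
         (1, 2, 10), (2, 2, 20), (1, 2, 30)]

def with_lag(rows):
    """Pair every move with the previous 'played' value of the same game."""
    by_game = defaultdict(list)
    for row in rows:
        by_game[row[1]].append(row)            # PARTITION BY gid
    result = []
    for gid in sorted(by_game):
        prev = None                            # the first move of a game has no predecessor
        for uid, _, played in sorted(by_game[gid], key=lambda r: r[2]):  # ORDER BY played
            result.append((uid, gid, played, prev))
            prev = played
    return result

lagged = with_lag(moves)
for row in lagged:
    print(row)
```

The first move of each game gets None as its previous value, which corresponds to the NULL that LAG returns at the start of a partition.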

Unfortunately, I get the syntax error "window function calls cannot be nested" with my SQL query:

SELECT AVG(played - LAG(played) OVER (PARTITION BY gid order by played))
OVER (PARTITION BY gid order by played)
FROM moves
-- trying to calculate average thinking time for player Alice
WHERE uid = 1;

UPDATE:

Since the number of games in my database is large and grows day by day, I have tried (here is the new SQL Fiddle) adding a condition to the inner SELECT query:

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;

However, for some reason this changes the returned value quite radically, to 1 min 45 sec.

And I wonder why the inner SELECT query suddenly returns many more rows. Is some condition perhaps missing in my JOIN?

UPDATE 2:

Oh ok, I get why the average value decreases: the JOIN produces multiple rows with the same timestamp (i.e. played - prev_played = 0), but how do I fix the JOIN?

UPDATE 3:

Never mind, I was missing the m.gid = g.gid AND condition in my SQL JOIN; now it works:

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.gid = g.gid AND m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;
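The row multiplication described in the updates can be reproduced with a small sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or newer is needed for window functions) and, as an assumption for portability, stores played as whole minutes rather than timestamptz. The one-month filter is left out because it does not affect the sample data.

```python
import sqlite3

# SQLite stands in for PostgreSQL; "played" is in whole minutes (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (gid INTEGER PRIMARY KEY, player1 INTEGER, player2 INTEGER);
    CREATE TABLE moves (mid INTEGER PRIMARY KEY, uid INTEGER, gid INTEGER, played INTEGER);
    INSERT INTO games (player1, player2) VALUES (1, 2), (1, 2), (1, 2);
""")
minutes = {1: [1, 2, 3, 4, 5, 6],
           2: [10, 20, 30, 40, 50, 60],
           3: [110, 120, 130, 140, 150, 160]}
rows = [(1 if i % 2 == 0 else 2, gid, t)      # Alice and Bob alternate
        for gid, times in minutes.items() for i, t in enumerate(times)]
conn.executemany("INSERT INTO moves (uid, gid, played) VALUES (?, ?, ?)", rows)

LOOSE = "m.uid IN (g.player1, g.player2)"      # join condition from the first update
STRICT = "m.gid = g.gid AND " + LOOSE          # corrected join condition

def joined_rows(condition):
    sql = "SELECT COUNT(*) FROM moves m JOIN games g ON " + condition
    return conn.execute(sql).fetchone()[0]

def alice_avg(condition):
    sql = """
        SELECT AVG(played - prev_played)
        FROM (SELECT m.*,
                     LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
              FROM moves m
              JOIN games g ON """ + condition + """
             ) AS m
        WHERE uid = 1
    """
    return conn.execute(sql).fetchone()[0]

print(joined_rows(LOOSE), alice_avg(LOOSE))    # 54 1.75
print(joined_rows(STRICT), alice_avg(STRICT))  # 18 7.0
```

Without m.gid = g.gid every move matches all three games, so each timestamp appears three times per partition and most differences are 0. The resulting 1.75 minutes is the 1 min 45 sec reported above.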

You need subqueries to nest the window functions. I think this does what you want:

select avg(played - prev_played)
from (select m.*,
             lag(m.played) over (partition by gid order by played) as prev_played
      from moves m
     ) m
where uid = 1;

Note: The WHERE needs to go in the outer query, so that it doesn't affect the lag().
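A quick way to see why the placement of the filter matters is to run both variants on the sample data. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or newer for window functions) and assumes played is stored as whole minutes instead of timestamptz.

```python
import sqlite3

# SQLite stands in for PostgreSQL; "played" is in whole minutes (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moves (mid INTEGER PRIMARY KEY, uid INTEGER, gid INTEGER, played INTEGER)")
minutes = {1: [1, 2, 3, 4, 5, 6],
           2: [10, 20, 30, 40, 50, 60],
           3: [110, 120, 130, 140, 150, 160]}
rows = [(1 if i % 2 == 0 else 2, gid, t)      # Alice and Bob alternate
        for gid, times in minutes.items() for i, t in enumerate(times)]
conn.executemany("INSERT INTO moves (uid, gid, played) VALUES (?, ?, ?)", rows)

# Filter in the outer query: lag() sees the moves of both players.
outer_filter = conn.execute("""
    SELECT AVG(played - prev_played)
    FROM (SELECT m.*,
                 LAG(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
          FROM moves m
         ) AS m
    WHERE uid = 1
""").fetchone()[0]

# Filter in the inner query: lag() only sees Alice's own moves.
inner_filter = conn.execute("""
    SELECT AVG(played - prev_played)
    FROM (SELECT m.*,
                 LAG(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
          FROM moves m
          WHERE uid = 1
         ) AS m
""").fetchone()[0]

print(outer_filter)  # 7.0
print(inner_filter)  # 14.0
```

With the filter inside, the previous row is Alice's own previous move, so the result measures the time between her moves (14 minutes on average) rather than her thinking time after Bob's move (7 minutes).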

@gordon's answer is probably good enough, but it isn't the result you asked for in your comment. It only matches because the sample data have the same number of rows for each game, so the average of the per-game averages is the same as the overall average. If you want the average of the per-game averages, you need one additional level:

WITH cte AS (
    SELECT gid, AVG(played - prev_played) AS play_avg
    FROM (SELECT m.*,
                 LAG(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
          FROM moves m
         ) m
    WHERE uid = 1
    GROUP BY gid
)
SELECT AVG(play_avg)
FROM cte;
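The difference between the two averages only shows up when games have different numbers of moves. The sketch below uses hypothetical data (a 6-move fast game and a 4-move slow game, not the data from the question), with Python's sqlite3 module standing in for PostgreSQL (SQLite 3.25 or newer) and played stored as whole minutes.

```python
import sqlite3

# SQLite stands in for PostgreSQL; "played" is in whole minutes (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moves (mid INTEGER PRIMARY KEY, uid INTEGER, gid INTEGER, played INTEGER)")

# Hypothetical data: game 1 has 6 moves, game 2 has only 4.
rows = [(1, 1, 1), (2, 1, 2), (1, 1, 3), (2, 1, 4), (1, 1, 5), (2, 1, 6),
        (1, 2, 10), (2, 2, 20), (1, 2, 30), (2, 2, 40)]
conn.executemany("INSERT INTO moves (uid, gid, played) VALUES (?, ?, ?)", rows)

LAGGED = """(SELECT m.*,
                    LAG(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
             FROM moves m) AS m"""

# One average over all of Alice's moves.
overall = conn.execute(
    "SELECT AVG(played - prev_played) FROM " + LAGGED + " WHERE uid = 1"
).fetchone()[0]

# Average per game first, then the average of those per-game values.
per_game = conn.execute("""
    WITH cte AS (
        SELECT gid, AVG(played - prev_played) AS play_avg
        FROM """ + LAGGED + """
        WHERE uid = 1
        GROUP BY gid
    )
    SELECT AVG(play_avg) FROM cte
""").fetchone()[0]

print(overall)   # 4.0  -> (1 + 1 + 10) / 3
print(per_game)  # 5.5  -> (1 + 10) / 2
```

The overall average weights every move equally, so the longer game dominates. The per-game version weights every game equally, regardless of how many moves it contains.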

