Gap in time between dates
Question: I am trying to calculate the gap in time between trips, that is, the time between the end of one ride of a particular bike and the start of that bike's next ride.
I am working with the bigquery-public-data.new_york_citibike.citibike_trips table. The data has roughly this schema:
trip_duration, start_time, end_time, start_station_id, end_station_id, bike_id
My current guess is that I need to write a couple of queries:
SELECT
bk.end_time as idle_start,
bk.bike_id as id
FROM `bigquery-public-data.new_york_citibike.citibike_trips` as bk
And
SELECT
bk.start_time as idle_end,
bk.bike_id as id
FROM `bigquery-public-data.new_york_citibike.citibike_trips` as bk
Then I need to find a way to join them together where:
id = id and idle_start < idle_end
and also calculate a new metric called gap:
(idle_end - idle_start) as gap
I'm fairly new at this, so I haven't been able to come up with a solution. I feel like this would work with joins, but I'm not very good at them yet.
Given table rides:
CREATE TABLE rides (
ride_id int primary key auto_increment
, bike_id int
, stime int
, etime int
);
and data:
INSERT INTO rides (bike_id, stime, etime) VALUES
( 1, 1, 5 )
, ( 1, 8, 15 )
, ( 1, 26, 30 )
, ( 1, 55, 56 )
, ( 2, 11, 12 )
, ( 2, 19, 25 )
, ( 1, 88, 99 )
, ( 2, 26, 28 )
, ( 3, 5, 21 )
, ( 4, 5, 21 )
, ( 4, 55, 57 )
;
Find the gap between each ride per bike. I added logic to assign a gap of 0 to the first ride found for each bike. I didn't use date/times, just integers to indicate time periods. This can be changed to use date/time values and time differences as needed:
WITH gaps AS (
SELECT t.*
, stime - COALESCE(LAG(etime) OVER (PARTITION BY bike_id ORDER BY stime), stime) AS gap
FROM rides AS t
)
SELECT *
FROM gaps
ORDER BY bike_id, stime
;
Result:
+---------+---------+-------+-------+------+
| ride_id | bike_id | stime | etime | gap  |
+---------+---------+-------+-------+------+
|       1 |       1 |     1 |     5 |    0 |
|       2 |       1 |     8 |    15 |    3 |
|       3 |       1 |    26 |    30 |   11 |
|       4 |       1 |    55 |    56 |   25 |
|       7 |       1 |    88 |    99 |   32 |
|       5 |       2 |    11 |    12 |    0 |
|       6 |       2 |    19 |    25 |    7 |
|       8 |       2 |    26 |    28 |    1 |
|       9 |       3 |     5 |    21 |    0 |
|      10 |       4 |     5 |    21 |    0 |
|      11 |       4 |    55 |    57 |   34 |
+---------+---------+-------+-------+------+
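For the table in the question, the same idea might look roughly like the sketch below. It is untested against the public dataset and rests on a few assumptions: the column names are the ones given in the question (bike_id, start_time, end_time), which the actual table may spell differently, and both time columns are assumed to be DATETIME. If they turn out to be TIMESTAMP, TIMESTAMP_DIFF would be the function to use instead of DATETIME_DIFF. The idle_start and idle_end aliases are borrowed from the question.

```sql
-- Sketch only: column names follow the question and may differ in the real table.
-- Assumes start_time and end_time are DATETIME columns.
WITH gaps AS (
  SELECT
    bk.bike_id,
    bk.start_time,
    bk.end_time,
    -- End of the previous ride of the same bike; NULL for the bike's first ride.
    LAG(bk.end_time) OVER (PARTITION BY bk.bike_id ORDER BY bk.start_time) AS idle_start
  FROM `bigquery-public-data.new_york_citibike.citibike_trips` AS bk
)
SELECT
  bike_id,
  idle_start,
  start_time AS idle_end,
  -- Idle time in seconds; 0 for each bike's first ride, as in the integer example.
  COALESCE(DATETIME_DIFF(start_time, idle_start, SECOND), 0) AS gap
FROM gaps
ORDER BY bike_id, start_time;
```

Because LAG already pairs each ride with the previous ride of the same bike, no self-join on id = id and idle_start < idle_end is needed. Dropping the COALESCE would leave gap as NULL for first rides, which may be preferable if a real 0-second gap should stay distinguishable from "no previous ride".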