I have the following table, which captures driver and rider details. For each day (datetime) there is one driver and zero or more riders. If there is more than one rider, each rider's data (name and age) is captured in a new row with the same datetime. This may not be the right way to structure the data, but it is so primarily because the number of riders per driver per datetime varies.
id | datetime | driver | age | riders | rider_name | rider_age
---|------------|--------|------|--------|------------|---
1 | 03/03/2009 | joe | 24 | 0 | |
2 | 04/03/2009 | john | 39 | 1 | juliet | 30
3 | 05/03/2009 | borat | 32 | 2 | jane | 45
4 | 05/03/2009 | | | | mike | 18
5 | 06/03/2009 | john | 39 | 3 | duke | 42
6 | 06/03/2009 | | | | jose | 33
7 | 06/03/2009 | | | | kyle | 24
For each datetime value, I need the driver, the driver's age, the number of riders, the name of the youngest rider, and the number of riders within +/- 10 years of the driver's age:
datetime | driver | age | riders | youngest_rider | riders_within_ten_years_of_driver
------------|--------|------|--------|--------------|---
03/03/2009 | joe | 24 | 0 | | 0 # no rider
04/03/2009 | john | 39 | 1 | juliet | 1 # juliet
05/03/2009 | borat | 32 | 2 | mike | 0 # no rider
06/03/2009 | john | 39 | 3 | kyle | 2 # duke, jose
This is a very bad data structure: the driver name is empty on the additional rider rows, so you don't have a key for aggregation. A more normalized structure is better, but sometimes we are stuck with a particular format.
You need to get the id of the driver record for each row. For this, use a correlated subquery:
select r.*,
       (select max(r2.id)
        from riders r2
        where r2.id <= r.id and r2.driver is not null
       ) as driver_id
from riders r;
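To see what this subquery produces, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs it. It assumes the blank cells are stored as NULL rather than as empty strings (the `is not null` test depends on that); the helper names `load_riders` and `driver_ids` are made up for illustration.

```python
import sqlite3

# Sample rows from the question. Assumption: blank cells are NULL, not ''.
ROWS = [
    (1, "03/03/2009", "joe", 24, 0, None, None),
    (2, "04/03/2009", "john", 39, 1, "juliet", 30),
    (3, "05/03/2009", "borat", 32, 2, "jane", 45),
    (4, "05/03/2009", None, None, None, "mike", 18),
    (5, "06/03/2009", "john", 39, 3, "duke", 42),
    (6, "06/03/2009", None, None, None, "jose", 33),
    (7, "06/03/2009", None, None, None, "kyle", 24),
]


def load_riders():
    """Create an in-memory `riders` table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table riders (id integer primary key, datetime text, driver text,"
        " age integer, riders integer, rider_name text, rider_age integer)"
    )
    conn.executemany("insert into riders values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def driver_ids(conn):
    """Return (id, driver_id) pairs: each row tagged with its driver row's id."""
    sql = """
        select r.id,
               (select max(r2.id)
                from riders r2
                where r2.id <= r.id and r2.driver is not null
               ) as driver_id
        from riders r
        order by r.id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for pair in driver_ids(load_riders()):
        print(pair)
```

Rows 4, 6 and 7 (the extra rider rows) pick up the id of the nearest preceding row that has a driver, which is what makes the later grouping possible. This relies on the ids being assigned in insertion order.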
Then we build on this using a join to get the driver information, plus conditional aggregation. The join matters because age is empty on the extra rider rows, so the rider's age has to be compared with the age on the driver row. For everything but the rider with the minimum age:
select r.datetime,
       d.driver,
       d.age,
       d.riders,
       sum(case when abs(r.rider_age - d.age) <= 10 then 1 else 0 end) as riders_within_10_years
from (select r.*,
             (select max(r2.id)
              from riders r2
              where r2.id <= r.id and r2.driver is not null
             ) as driver_id
      from riders r
     ) r join
     riders d
     on d.id = r.driver_id
group by r.datetime, r.driver_id, d.driver, d.age, d.riders;
Finding the rider with the minimum age is quite tricky with this data structure. One solution is to use a CTE:
with r as (
      select r.*,
             (select max(r2.id)
              from riders r2
              where r2.id <= r.id and r2.driver is not null
             ) as driver_id
      from riders r
     )
select r.datetime,
       d.driver,
       d.age,
       d.riders,
       sum(case when abs(r.rider_age - d.age) <= 10 then 1 else 0 end) as riders_within_10_years,
       (select r2.rider_name
        from r r2
        where r2.driver_id = r.driver_id
        order by r2.rider_age asc
        limit 1
       ) as minimum_age_rider
from r join
     riders d
     on d.id = r.driver_id
group by r.datetime, r.driver_id, d.driver, d.age, d.riders;
This is much harder than it needs to be because (1) the data structure is not very good and (2) SQLite is not particularly powerful (in particular, versions before 3.25 do not support window functions).
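As a rough check of the approach, the sketch below runs a variant of the CTE query against the question's sample rows in an in-memory SQLite database. It is an illustration under assumptions, not a tested production query: blank cells are assumed to be NULL, the CTE is renamed `tagged` for readability, each row is joined back to its driver row so the driver's age is available on every rider row, and the youngest rider is picked with an ascending sort. The function name `summarize` is hypothetical.

```python
import sqlite3

# Sample rows from the question. Assumption: blank cells are NULL, not ''.
ROWS = [
    (1, "03/03/2009", "joe", 24, 0, None, None),
    (2, "04/03/2009", "john", 39, 1, "juliet", 30),
    (3, "05/03/2009", "borat", 32, 2, "jane", 45),
    (4, "05/03/2009", None, None, None, "mike", 18),
    (5, "06/03/2009", "john", 39, 3, "duke", 42),
    (6, "06/03/2009", None, None, None, "jose", 33),
    (7, "06/03/2009", None, None, None, "kyle", 24),
]

QUERY = """
with tagged as (
      -- tag every row with the id of its driver row
      select r.*,
             (select max(r2.id)
              from riders r2
              where r2.id <= r.id and r2.driver is not null
             ) as driver_id
      from riders r
     )
select t.datetime, d.driver, d.age, d.riders,
       (select t2.rider_name
        from tagged t2
        where t2.driver_id = t.driver_id
        order by t2.rider_age asc
        limit 1
       ) as youngest_rider,
       sum(case when abs(t.rider_age - d.age) <= 10 then 1 else 0 end)
           as riders_within_10_years
from tagged t join
     riders d
     on d.id = t.driver_id
group by t.datetime, t.driver_id, d.driver, d.age, d.riders
order by t.driver_id
"""


def summarize():
    """Load the sample rows and return one summary tuple per driver row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table riders (id integer primary key, datetime text, driver text,"
        " age integer, riders integer, rider_name text, rider_age integer)"
    )
    conn.executemany("insert into riders values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

On the sample data this should reproduce the four rows of the expected output in the question. If the blank driver cells are actually empty strings, the `is not null` test would need to become something like `r2.driver <> ''`.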
If you provide the insert statements for the data, I can check whether this query works.
select datetime, driver, age, max(riders)
     , max(first_value(rider_name) over (partition by datetime, driver, age order by rider_age, rider_name)) youngest_rider
     , sum(case when rider_age between age - 10 and age + 10
                then 1
                else 0
           end
          ) count_riders_in_age_grp
from table
group by datetime, driver, age
This is a terrible database structure, but I'm assuming it's a homework question. Regardless, this should work:
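SELECT M.[DateTime],
       MAX(M.driver) AS [Driver],
       MAX(M.age) AS [Age],
       MAX(M.riders) AS [Riders],
       t.rider_name AS [Youngest Rider],
       ISNULL(SUM(CASE WHEN M.rider_age BETWEEN d.age - 10 AND d.age + 10 THEN 1 ELSE 0 END), 0) AS [Riders within Ten Years of Driver]
FROM my_table M
-- OUTER APPLY keeps dates that have no riders at all
OUTER APPLY
(
    SELECT rider_name
    FROM my_table
    WHERE [DateTime] = M.[DateTime]
      AND rider_age = (SELECT MIN(rider_age) FROM my_table WHERE [DateTime] = M.[DateTime])
) t
-- the driver's age for the date, so it is not an aggregate nested inside SUM
CROSS APPLY
(
    SELECT MAX(age) AS age
    FROM my_table
    WHERE [DateTime] = M.[DateTime]
) d
GROUP BY M.[DateTime], t.rider_name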
SELECT
    a.datetime
    ,max(a.driver) as driver
    ,max(a.age) as age
    ,max(a.riders) as riders
    ,first_value(a.rider_name) OVER
        (PARTITION BY a.datetime
         ORDER BY a.rider_age
         rows unbounded preceding)
        as youngest_rider
    ,count(b.id) as riders_within_ten_years_of_driver
FROM
    my_table a
LEFT JOIN
    my_table b
ON
    a.datetime = b.datetime
    AND a.age - b.rider_age between -10 AND 10
GROUP BY
    a.datetime
    ,youngest_rider
This is a mess. It would be much simpler if you had a table for drivers, riders and rides.
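For comparison, here is a minimal sketch of what a normalized layout might look like, again using an in-memory SQLite database. The schema is hypothetical: the table and column names (`drivers`, `rides`, `riders`, `ride_id`, and so on) are assumptions chosen for illustration, not something given in the question.

```python
import sqlite3

# Hypothetical normalized schema: one row per driver, per ride, and per rider.
SCHEMA = """
create table drivers (id integer primary key, name text, age integer);
create table rides   (id integer primary key, datetime text,
                      driver_id integer references drivers(id));
create table riders  (id integer primary key,
                      ride_id integer references rides(id),
                      name text, age integer);
"""

QUERY = """
select rd.datetime, d.name, d.age,
       count(p.id) as riders,
       (select p2.name
        from riders p2
        where p2.ride_id = rd.id
        order by p2.age
        limit 1
       ) as youngest_rider,
       sum(case when abs(p.age - d.age) <= 10 then 1 else 0 end)
           as riders_within_10_years
from rides rd
join drivers d on d.id = rd.driver_id
left join riders p on p.ride_id = rd.id   -- keeps rides with no riders
group by rd.id, rd.datetime, d.name, d.age
order by rd.id
"""


def build():
    """Create the three tables and load the question's data in normalized form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("insert into drivers values (?, ?, ?)",
                     [(1, "joe", 24), (2, "john", 39), (3, "borat", 32)])
    conn.executemany("insert into rides values (?, ?, ?)",
                     [(1, "03/03/2009", 1), (2, "04/03/2009", 2),
                      (3, "05/03/2009", 3), (4, "06/03/2009", 2)])
    conn.executemany("insert into riders (ride_id, name, age) values (?, ?, ?)",
                     [(2, "juliet", 30), (3, "jane", 45), (3, "mike", 18),
                      (4, "duke", 42), (4, "jose", 33), (4, "kyle", 24)])
    return conn


def summarize(conn):
    """Return one summary tuple per ride."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in summarize(build()):
        print(row)
```

With this layout every rider row carries an explicit `ride_id`, so there is no need to infer the driver row from the id ordering, and the rider count can be computed rather than stored.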