The question is how long have these customers been jerks on any given date.
I'm working against Sybase
For this simplified table structure of table history_data
table: history_of_jerkiness processing_date name is_jerk --------------- ----- ------- 20090101 Matt true 20090101 Bob false 20090101 Alex true 20090101 Carol true 20090102 Matt true 20090102 Bob true 20090102 Alex false 20090102 Carol true 20090103 Matt true 20090103 Bob true 20090103 Alex true 20090103 Carol false
The report for the 3rd should show that Matt has always been a jerk, Alex has just become a jerk, and Bob has been a jerk for 2 days.
name days jerky ----- ---------- Matt 3 Bob 2 Alex 1
I'd like to find these spans of time dynamically, so if I run the report for the 2nd, I should get different results:
name days_jerky ----- ---------- Matt 2 Bob 1 Carol 2
The key here is trying to find only continuous spans older than a certain date. I've found a few leads, but it seems like a problem where there would be very smart tricky solutions.
My solution from SQL Server - same as Dems but I put in a min baseline myself. It assumes there are no gaps - that is there is an entry for each day for each person. If that isn't true then I'd have to loop.
DECLARE @run_date datetime
DECLARE @min_date datetime
SET @run_date = {d '2009-01-03'}
-- get day before any entries in the table to use as a false baseline date
SELECT @min_date = DATEADD(day, -1, MIN(processing_date)) FROM history_of_jerkiness
-- get last not a jerk date for each name that is before or on the run date
-- the difference in days between the run date and the last not a jerk date is the number of days as a jerk
SELECT [name], DATEDIFF(day, MAX(processing_date), @run_date)
FROM (
SELECT processing_date, [name], is_jerk
FROM history_of_jerkiness
UNION ALL
SELECT DISTINCT @min_date, [name], 0
FROM history_of_jerkiness ) as data
WHERE is_jerk = 0
AND processing_date <= @run_date
GROUP BY [name]
HAVING DATEDIFF(day, MAX(processing_date), @run_date) > 0
I created the test table with the following:
CREATE TABLE history_of_jerkiness (processing_date datetime, [name] varchar(20), is_jerk bit)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Bob', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Alex', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Carol', 0)
"This can be made simple if you structure the data so as to meet the following criteria...
All people must have an initial record where they are not a jerk"
What criteria the data should and should not meet is up to the user, not to the developer.
This can be made simple if you structure the data so as to meet the following criteria...
All people must have an initial record where they are not a jerk
You can do something like...
SELECT
name,
MAX(date) last_day_jerk_free
FROM
jerkiness AS [data]
WHERE
jerk = 'false'
AND date <= 'a date'
GROUP BY
name
You already know what the base date is ('a date'), and now you know the last day they were not a jerk. I don't know sybase but I'm sure there are commands you can use to get the number of days between 'a data' and 'last_day_jerk_free'
EDIT:
There are multiple ways of artificially creating an initialising "not jerky" record. The one suggested by Will Rickards uses a sub query containing a union. Doing so, however, has two down sides...
1. The sub-query masks any indexes which may otherwise have been used
2. It assumes all people have data starting from the same point
Alternatively, take Will Rickard's suggestion and move the aggregation from the outer query into the inner query (so maximising use of indexes), and union with a generalised 2nd sub query to create the starting jerky = false record...
SELECT name, DATEDIFF(day, MAX(processing_date), @run_date) AS days_jerky
FROM (
SELECT name, MAX(processing_date) as processing_date
FROM history_of_jerkiness
WHERE is_jerk = 0 AND processing_date <= @run_date
GROUP BY name
UNION
SELECT name, DATEADD(DAY, -1, MIN(processing_date))
FROM history_of_jerkiness
WHERE processing_date <= @run_date
GROUP BY name
) as data
GROUP BY
name
The outer query still has to do a max without indexes, but over a reduced number of records (2 per name, rather than n per name). The number of records is also reduced by not requiring every name to have a value for every date in use. There are many other ways of doing this, some can be seen in my edit history.
How about this:
select a.name,count(*) from history_of_jerkiness a
left join history_of_jerkiness b
on a.name = b.name
and a.processing_date >= b.processing_date
and a.is_jerk = 'true'
where not exists
( select * from history_of_jerkiness c
where a.name = c.name
and c.processing_date between a.processing_date and b.processing_date
and c.is_jerk = 'false'
)
and a.processing_date <= :a_certain_date;
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.