
SQL Dates Query - how long has this condition been true

The question is: on any given date, how long have these customers been jerks?

I'm working against Sybase.

Here is a simplified version of the table structure:

table: history_of_jerkiness
processing_date  name  is_jerk
---------------  ----- -------
20090101         Matt  true
20090101         Bob   false        
20090101         Alex  true        
20090101         Carol true        
20090102         Matt  true        
20090102         Bob   true        
20090102         Alex  false        
20090102         Carol true        
20090103         Matt  true        
20090103         Bob   true        
20090103         Alex  true        
20090103         Carol false        

The report for the 3rd should show that Matt has always been a jerk, Alex has just become a jerk, and Bob has been a jerk for 2 days. Carol is not listed because she is not a jerk on the 3rd.

name    days_jerky
-----   ----------
Matt    3
Bob     2
Alex    1

I'd like to find these spans of time dynamically, so if I run the report for the 2nd, I should get different results:

name    days_jerky
-----   ----------
Matt    2
Bob     1
Carol   2

The key here is trying to find only the continuous spans that lead up to a certain date. I've found a few leads, but it seems like the kind of problem that has very smart, tricky solutions.
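As a reference for checking any SQL answer against the expected reports, here is a minimal Python sketch of the streak definition itself. It is not a Sybase solution. The function name `days_jerky` and the rule that a missing row ends a streak are assumptions made for illustration.

```python
from datetime import date, timedelta

# Sample rows from the question: (processing_date, name, is_jerk)
ROWS = [
    (date(2009, 1, 1), "Matt", True), (date(2009, 1, 1), "Bob", False),
    (date(2009, 1, 1), "Alex", True), (date(2009, 1, 1), "Carol", True),
    (date(2009, 1, 2), "Matt", True), (date(2009, 1, 2), "Bob", True),
    (date(2009, 1, 2), "Alex", False), (date(2009, 1, 2), "Carol", True),
    (date(2009, 1, 3), "Matt", True), (date(2009, 1, 3), "Bob", True),
    (date(2009, 1, 3), "Alex", True), (date(2009, 1, 3), "Carol", False),
]


def days_jerky(rows, run_date):
    """Return {name: length of the unbroken jerk streak ending on run_date}."""
    status = {(name, day): is_jerk for day, name, is_jerk in rows}
    result = {}
    for name in {name for _, name, _ in rows}:
        streak = 0
        day = run_date
        # Walk backwards one day at a time; stop at a non-jerk day
        # or at a day with no row (assumption: a gap ends the streak).
        while status.get((name, day), False):
            streak += 1
            day -= timedelta(days=1)
        if streak > 0:
            result[name] = streak
    return result


print(days_jerky(ROWS, date(2009, 1, 3)))
print(days_jerky(ROWS, date(2009, 1, 2)))
```

People who are not jerks on the run date are left out of the result, which matches the two reports in the question.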

My solution is for SQL Server. It is the same as Dems', but I put in a minimum baseline date myself. It assumes there are no gaps, that is, there is an entry for each day for each person. If that isn't true, then I'd have to loop.

DECLARE @run_date datetime
DECLARE @min_date datetime

SET @run_date = {d '2009-01-03'}

-- get day before any entries in the table to use as a false baseline date
SELECT @min_date = DATEADD(day, -1, MIN(processing_date)) FROM history_of_jerkiness

-- get last not a jerk date for each name that is before or on the run date
-- the difference in days between the run date and the last not a jerk date is the number of days as a jerk
SELECT [name], DATEDIFF(day, MAX(processing_date), @run_date)
FROM (
     SELECT processing_date, [name], is_jerk
     FROM history_of_jerkiness
     UNION ALL
     SELECT DISTINCT @min_date, [name], 0
     FROM history_of_jerkiness ) as data
WHERE is_jerk = 0
  AND processing_date <= @run_date
GROUP BY [name]
HAVING DATEDIFF(day, MAX(processing_date), @run_date) > 0
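Since the query above relies on every person having a row for every day, it may be worth checking that assumption before trusting the result. The following is a rough Python sketch, not part of the original answer. The helper name `find_gaps` and the sample rows (where Bob is missing a day) are hypothetical.

```python
from datetime import date, timedelta


def find_gaps(rows):
    """Return sorted (name, missing_day) pairs between each name's first and last row."""
    days_by_name = {}
    for day, name, _is_jerk in rows:
        days_by_name.setdefault(name, set()).add(day)
    gaps = []
    for name, days in days_by_name.items():
        day = min(days)
        last = max(days)
        # Step through every calendar day in the person's range.
        while day <= last:
            if day not in days:
                gaps.append((name, day))
            day += timedelta(days=1)
    return sorted(gaps)


# Hypothetical data: Bob has no row for 2009-01-02.
rows = [
    (date(2009, 1, 1), "Matt", True),
    (date(2009, 1, 2), "Matt", True),
    (date(2009, 1, 3), "Matt", True),
    (date(2009, 1, 1), "Bob", False),
    (date(2009, 1, 3), "Bob", True),
]
print(find_gaps(rows))
```

This only looks for holes inside each person's own date range. It does not check whether everyone starts on the same day.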

I created the test table with the following:

CREATE TABLE history_of_jerkiness (processing_date datetime, [name] varchar(20), is_jerk bit)

INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Bob', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-01'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Alex', 0)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-02'}, 'Carol', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Matt', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Bob', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Alex', 1)
INSERT INTO history_of_jerkiness (processing_date, [name], is_jerk) VALUES ({d '2009-01-03'}, 'Carol', 0) 

"This can be made simple if you structure the data so as to meet the following criteria...

All people must have an initial record where they are not a jerk"

What criteria the data should and should not meet is up to the user, not to the developer.

This can be made simple if you structure the data so as to meet the following criteria...

All people must have an initial record where they are not a jerk

You can do something like...

SELECT
   name,
   MAX(processing_date) AS last_day_jerk_free
FROM
   history_of_jerkiness AS [data]
WHERE
   is_jerk = 'false'
   AND processing_date <= 'a date'   -- placeholder for the report date
GROUP BY
   name

You already know what the base date is ('a date'), and now you know the last day they were not a jerk. I don't know Sybase, but I'm sure there are commands you can use to get the number of days between 'a date' and 'last_day_jerk_free'.

EDIT:

There are multiple ways of artificially creating an initialising "not jerky" record. The one suggested by Will Rickards uses a sub-query containing a union. Doing so, however, has two downsides...
1. The sub-query masks any indexes which may otherwise have been used
2. It assumes all people have data starting from the same point

Alternatively, take Will Rickards' suggestion and move the aggregation from the outer query into the inner query (so maximising use of indexes), and union it with a generalised second sub-query to create the starting jerky = false record...

SELECT name, DATEDIFF(day, MAX(processing_date), @run_date) AS days_jerky
FROM (

    SELECT name, MAX(processing_date) as processing_date
    FROM history_of_jerkiness
    WHERE is_jerk = 0 AND processing_date <= @run_date
    GROUP BY name

    UNION

    SELECT name, DATEADD(DAY, -1, MIN(processing_date))
    FROM history_of_jerkiness
    WHERE processing_date <= @run_date
    GROUP BY name

    ) as data
GROUP BY
   name

The outer query still has to do a max without indexes, but over a reduced number of records (2 per name, rather than n per name). The number of records is also reduced by not requiring every name to have a value for every date in use. There are many other ways of doing this; some can be seen in my edit history.
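To try the query above against the sample data without a Sybase or SQL Server instance, here is a rough sketch using Python's built-in `sqlite3` module. SQLite is an assumption made only for testing. It has no `DATEDIFF` or `DATEADD`, so `julianday()` and `date(..., '-1 day')` stand in for them, and dates are stored as ISO text. The function name `days_jerky_sql` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_of_jerkiness "
    "(processing_date TEXT, name TEXT, is_jerk INTEGER)"
)
sample = [
    ("2009-01-01", "Matt", 1), ("2009-01-01", "Bob", 0),
    ("2009-01-01", "Alex", 1), ("2009-01-01", "Carol", 1),
    ("2009-01-02", "Matt", 1), ("2009-01-02", "Bob", 1),
    ("2009-01-02", "Alex", 0), ("2009-01-02", "Carol", 1),
    ("2009-01-03", "Matt", 1), ("2009-01-03", "Bob", 1),
    ("2009-01-03", "Alex", 1), ("2009-01-03", "Carol", 0),
]
conn.executemany("INSERT INTO history_of_jerkiness VALUES (?, ?, ?)", sample)

# Same shape as the query above: the latest not-a-jerk date per name,
# unioned with an artificial baseline one day before each name's first row.
QUERY = """
SELECT name,
       CAST(julianday(:run_date) - julianday(MAX(processing_date)) AS INTEGER)
           AS days_jerky
FROM (
    SELECT name, MAX(processing_date) AS processing_date
    FROM history_of_jerkiness
    WHERE is_jerk = 0 AND processing_date <= :run_date
    GROUP BY name

    UNION

    SELECT name, date(MIN(processing_date), '-1 day')
    FROM history_of_jerkiness
    WHERE processing_date <= :run_date
    GROUP BY name
) AS data
GROUP BY name
"""


def days_jerky_sql(run_date):
    """Run the query for run_date and drop names with a zero-day streak."""
    rows = conn.execute(QUERY, {"run_date": run_date}).fetchall()
    return {name: days for name, days in rows if days > 0}


print(days_jerky_sql("2009-01-03"))
print(days_jerky_sql("2009-01-02"))
```

Without the final filter, the query also returns people who are not jerks on the run date, with a count of 0 (Carol on the 3rd, for example). The first answer handles this with its HAVING clause.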

How about this:

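-- a: the row on the report date; b: each earlier jerk day for the same name
select a.name, count(*) as days_jerky
from history_of_jerkiness a
join history_of_jerkiness b
on a.name = b.name
and b.processing_date <= a.processing_date
and b.is_jerk = 'true'
where a.processing_date = :a_certain_date
and a.is_jerk = 'true'
-- keep b only if no non-jerk day lies between it and the report date
and not exists
( select * from history_of_jerkiness c
  where a.name = c.name
  and c.processing_date between b.processing_date and a.processing_date
  and c.is_jerk = 'false'
)
group by a.name;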
