PostgreSQL: LAG until a row with a certain value is found, and return that value
I have a dataset containing emails from customers.
One of the columns is Type. Type can hold the value 'Duplicate Case', which flags that a customer has just sent us a barrage of emails about the same topic. We only reply to the original and close all the other cases as duplicates. What I want to do, though, is get the type of the original email.
I want to be able to create the column Original Type:
There generally aren't more than 5 duplicates of a case per customer. I would also like to add logic that only returns a result when the duplicate case was created within 24 hours of the original.
I have this awful piece of code:
CASE
    WHEN type = 'Duplicate Case'
     AND LAG(type, 4) OVER (PARTITION BY c.client_code ORDER BY case_number ASC) = 'Duplicate Case'
        THEN LAG(type, 5) OVER (PARTITION BY c.client_code ORDER BY case_number ASC)
    WHEN type = 'Duplicate Case'
     AND LAG(type, 3) OVER (PARTITION BY c.client_code ORDER BY case_number ASC) = 'Duplicate Case'
        THEN LAG(type, 4) OVER (PARTITION BY c.client_code ORDER BY case_number ASC)
    WHEN type = 'Duplicate Case'
     AND LAG(type, 2) OVER (PARTITION BY c.client_code ORDER BY case_number ASC) = 'Duplicate Case'
        THEN LAG(type, 3) OVER (PARTITION BY c.client_code ORDER BY case_number ASC)
    WHEN LAG(type) OVER (PARTITION BY c.client_code ORDER BY case_number ASC) = 'Duplicate Case'
     AND type = 'Duplicate Case'
        THEN LAG(type, 2) OVER (PARTITION BY c.client_code ORDER BY case_number ASC)
    WHEN type = 'Duplicate Case'
        THEN LAG(type) OVER (PARTITION BY c.client_code ORDER BY case_number ASC)
END AS original_type
And this gives me sort of what I want:
But how can I add the time logic? I want to fill in the type from the first non-duplicate case only if the duplicate was created within 24 hours of that original.
What you need to do here is create a temporary table with a placeholder column, then fill that column in afterwards using your criteria:
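/* Main table */
DROP TABLE IF EXISTS cases;
CREATE TEMPORARY TABLE cases AS
SELECT
    c.created_date,
    c.type,
    -- Duplicates start out as 'Unknown'; everything else keeps its own type
    CASE WHEN c.type = 'Duplicate Case'
         THEN CAST('Unknown' AS VARCHAR(40))
         ELSE c.type END AS original_type,
    -- Placeholder, filled in by the first UPDATE
    CAST('' AS VARCHAR(40)) AS original_case_number,
    c.client_code,
    c.case_number
FROM "case" AS c  -- CASE is a reserved word, so a table with this name must be quoted
ORDER BY c.client_code DESC, c.case_number DESC;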
/* Append previous case data */
UPDATE cases
SET original_case_number = prev_case_number
FROM
(
    SELECT
        a.case_number,
        -- Latest non-duplicate case for the same client inside the time window
        MAX(b.case_number) AS prev_case_number
    FROM cases a
    LEFT JOIN cases b ON a.client_code = b.client_code
        -- The window here is 48 hours; use INTERVAL '24 hour' for the 24-hour rule in the question
        AND b.created_date BETWEEN a.created_date - INTERVAL '48 hour' AND
                                   a.created_date + INTERVAL '1 second'
        AND b.type <> 'Duplicate Case'
        AND a.type = 'Duplicate Case'
    GROUP BY 1
) prev
WHERE cases.case_number = prev.case_number;
/* Copy the type of the matched original onto each duplicate */
UPDATE cases
SET original_type = b.type
FROM cases AS b
WHERE cases.original_case_number = b.case_number;
SELECT * FROM cases;
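As an alternative to the temporary table, the "LAG until a non-duplicate row is found" idea can also be sketched in a single query with window functions. This is not the approach used above; it is a gaps-and-islands variant that assumes case_number increases with time within each client and that created_date is a timestamp. The sample_cases rows, the grp, first_type and first_created names, and the 24-hour interval are all illustrative assumptions, not part of the original data.

```sql
-- Hypothetical sample rows; the real table and its column types are not shown in the question.
WITH sample_cases (client_code, case_number, created_date, type) AS (
    VALUES
        ('A', 1, TIMESTAMP '2024-01-01 09:00', 'Billing'),
        ('A', 2, TIMESTAMP '2024-01-01 10:00', 'Duplicate Case'),
        ('A', 3, TIMESTAMP '2024-01-01 11:00', 'Duplicate Case'),
        ('A', 4, TIMESTAMP '2024-01-03 12:00', 'Duplicate Case'),
        ('B', 5, TIMESTAMP '2024-01-02 08:00', 'Duplicate Case'),
        ('B', 6, TIMESTAMP '2024-01-02 09:00', 'Delivery')
),
grouped AS (
    SELECT
        s.*,
        -- Running count of non-duplicate rows per client: an original and
        -- the duplicates that follow it end up with the same group number.
        COUNT(*) FILTER (WHERE type <> 'Duplicate Case')
            OVER (PARTITION BY client_code ORDER BY case_number) AS grp
    FROM sample_cases s
),
with_original AS (
    SELECT
        g.*,
        -- First row of each group, i.e. the candidate original case
        FIRST_VALUE(type)         OVER w AS first_type,
        FIRST_VALUE(created_date) OVER w AS first_created
    FROM grouped g
    WINDOW w AS (PARTITION BY client_code, grp ORDER BY case_number)
)
SELECT
    client_code,
    case_number,
    created_date,
    type,
    CASE
        WHEN type = 'Duplicate Case'
         AND first_type <> 'Duplicate Case'                        -- group really starts with an original
         AND created_date - first_created <= INTERVAL '24 hours'   -- assumed time rule
        THEN first_type
    END AS original_type
FROM with_original
ORDER BY client_code, case_number;
```

With these sample rows, cases 2 and 3 should pick up 'Billing', case 4 falls outside the 24-hour window and stays NULL, and case 5 stays NULL because no original precedes it for that client. Unlike the chain of LAG calls, this does not depend on a maximum number of duplicates per customer.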