
Removing NULLs in sequential data - MySQL

I have a database for tracking claims payments. There's a table for claims (claim), a table for monthly payments (claim_month) and a table defining each month (month). The month table holds its entries in order, so that if month_id[1] > month_id[2], then the second month is earlier than the first.

Using the following query (the randomisation of paid_to_date is added for privacy purposes):

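SELECT
    claim.claim_id,
    m.month_id,
    claim_month.claim_month_id,
    -- Real paid_to_date values are masked with a random number for privacy
    IF(claim_month.paid_to_date IS NOT NULL, ROUND(RAND(1) * 100), NULL) AS paid_to_date
FROM
    claim
    -- Pair every claim with every month that appears in claim_month
    CROSS JOIN ( SELECT DISTINCT month_id FROM claim_month ) AS m
    -- Attach the payment record for that claim and month, if one exists
    LEFT JOIN claim_month ON claim.claim_id = claim_month.claim_id
    AND m.month_id = claim_month.month_id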

I get the following data.

INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (25, 1004, 8584, 41);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (25, 1005, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (25, 1006, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (25, 1007, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (21, 1004, 8580, 87);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (21, 1005, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (21, 1006, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (21, 1007, NULL, NULL);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (5, 1004, 8564, 14);
INSERT INTO ``(`claim_id`, `month_id`, `claim_month_id`, `paid_to_date`) VALUES (5, 1005, 8627, 9);

[Image: visualisation of the data]

From here, I need to replace NULLs with the latest non-null observation for each claim_id.

  • Since I'm using MariaDB/MySQL, the LAG function doesn't allow for ignoring NULLs, which is unfortunate since it would otherwise be perfect.

  • I've also looked into using COALESCE and partitioning it, but that doesn't seem to be allowed either.

  • I've also looked into using user-defined functions; however, I'm using multiple data types and can't seem to work out how to define a function that doesn't require setting the output data type.

I've spent the whole morning looking through previous questions; however, most of them are for PostgreSQL, which isn't particularly helpful in this context. What am I missing?
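One workaround that is often suggested for a LAG without IGNORE NULLS is a running COUNT. COUNT(paid_to_date) only increases on non-NULL rows, so each NULL row ends up sharing a group number with the last non-NULL row before it, and MAX over that group returns the single non-NULL value. The sketch below is only an illustration under stated assumptions: it runs the SQL through Python's sqlite3 module (assuming an SQLite build with window functions, 3.25 or later) as a stand-in for MariaDB 10.2+ or MySQL 8.0+, and claim_grid, grp and paid_to_date_filled are hypothetical names, with claim_grid standing for the output of the query above.

```python
import sqlite3

# Sample rows copied from the question:
# (claim_id, month_id, claim_month_id, paid_to_date)
ROWS = [
    (25, 1004, 8584, 41), (25, 1005, None, None),
    (25, 1006, None, None), (25, 1007, None, None),
    (21, 1004, 8580, 87), (21, 1005, None, None),
    (21, 1006, None, None), (21, 1007, None, None),
    (5, 1004, 8564, 14), (5, 1005, 8627, 9),
]

# claim_grid is a hypothetical table holding the result of the question's query.
FILL_SQL = """
SELECT claim_id,
       month_id,
       -- Each (claim_id, grp) group holds at most one non-NULL value, so MAX returns it
       MAX(paid_to_date) OVER (PARTITION BY claim_id, grp) AS paid_to_date_filled
FROM (
    SELECT claim_id,
           month_id,
           paid_to_date,
           -- Running count of non-NULL values: stays flat across NULL rows
           COUNT(paid_to_date) OVER (PARTITION BY claim_id ORDER BY month_id) AS grp
    FROM claim_grid
) AS g
ORDER BY claim_id, month_id
"""


def forward_fill(rows):
    """Load the sample rows into an in-memory table and run the fill query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE claim_grid ("
        "claim_id INTEGER, month_id INTEGER, "
        "claim_month_id INTEGER, paid_to_date INTEGER)"
    )
    conn.executemany("INSERT INTO claim_grid VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(FILL_SQL).fetchall()
    conn.close()
    return result


filled = forward_fill(ROWS)
for row in filled:
    print(row)
```

Because each group contains at most one non-NULL value, MAX is only being used to pick that value, so the same pattern should carry over to other column types such as dates or strings. Rows that come before a claim's first payment fall into group 0 and stay NULL.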

I've worked out a solution, but I'm not convinced it's the best. I suspect that for larger databases, this would be quite demanding. It works in the meantime, however.

I've essentially joined the table onto itself repeatedly, matching each row to the earlier records on the same claim, using something similar to the following:

SELECT
    b.claim_id,
    b.month_id,
    b.claim_month_id,
    claim_month.claim_month_id AS claim_month_id_latest
FROM
    (
        -- b: for each claim/month pair, the latest month on or before it that has a payment record
        SELECT
            a.claim_id,
            a.month_id,
            a.claim_month_id,
            MAX(claim_month.month_id) AS source_month_id
        FROM
            (
                -- a: every claim paired with every month, plus the matching payment record if any
                SELECT
                    claim.claim_id,
                    m.month_id,
                    claim_month.claim_month_id
                FROM
                    claim
                    CROSS JOIN ( SELECT DISTINCT month_id FROM claim_month ) AS m
                    LEFT JOIN claim_month ON claim.claim_id = claim_month.claim_id
                    AND m.month_id = claim_month.month_id
            ) AS a
            LEFT JOIN claim_month ON a.claim_id = claim_month.claim_id
            AND a.month_id >= claim_month.month_id
        GROUP BY
            -- claim_month_id is listed so the query also runs under ONLY_FULL_GROUP_BY
            a.claim_id, a.month_id, a.claim_month_id
    ) AS b
    -- Join back to fetch the payment record for that latest month
    LEFT JOIN claim_month ON b.claim_id = claim_month.claim_id
    AND b.source_month_id = claim_month.month_id
ORDER BY b.claim_id, b.month_id
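The same "latest earlier record" idea can probably be written more compactly as a correlated subquery, which looks up, for each claim and month, the most recent non-NULL payment on or before that month. This is a sketch rather than a benchmarked replacement. It assumes the month table has a month_id column and uses it in place of the DISTINCT subquery, it fills months 1004 to 1007 from the sample data, the index name idx_claim_month_lookup is hypothetical, and it runs in SQLite via Python's sqlite3 module as a stand-in for MariaDB/MySQL.

```python
import sqlite3

# Only the non-NULL rows from the question's sample:
# (claim_month_id, claim_id, month_id, paid_to_date)
CLAIM_MONTH_ROWS = [
    (8584, 25, 1004, 41),
    (8580, 21, 1004, 87),
    (8564, 5, 1004, 14),
    (8627, 5, 1005, 9),
]

LOOKUP_SQL = """
SELECT c.claim_id,
       m.month_id,
       -- Most recent non-NULL payment on or before this month, for this claim
       (SELECT cm.paid_to_date
        FROM claim_month AS cm
        WHERE cm.claim_id = c.claim_id
          AND cm.month_id <= m.month_id
          AND cm.paid_to_date IS NOT NULL
        ORDER BY cm.month_id DESC
        LIMIT 1) AS paid_to_date_filled
FROM claim AS c
CROSS JOIN month AS m
ORDER BY c.claim_id, m.month_id
"""


def latest_payments(claim_month_rows):
    """Build a minimal assumed schema in memory and run the lookup query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE claim (claim_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE month (month_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE claim_month ("
        "claim_month_id INTEGER PRIMARY KEY, claim_id INTEGER, "
        "month_id INTEGER, paid_to_date INTEGER)"
    )
    # Hypothetical index: intended to let the subquery seek by claim and month
    conn.execute(
        "CREATE INDEX idx_claim_month_lookup ON claim_month (claim_id, month_id)"
    )
    conn.executemany("INSERT INTO claim VALUES (?)", [(5,), (21,), (25,)])
    conn.executemany(
        "INSERT INTO month VALUES (?)", [(1004,), (1005,), (1006,), (1007,)]
    )
    conn.executemany("INSERT INTO claim_month VALUES (?, ?, ?, ?)", claim_month_rows)
    result = conn.execute(LOOKUP_SQL).fetchall()
    conn.close()
    return result


latest = latest_payments(CLAIM_MONTH_ROWS)
for row in latest:
    print(row)
```

Whether this is faster than the repeated self-join on a large table would need to be measured. An index on (claim_id, month_id) is the usual first thing to try, since both versions filter claim_month on those two columns.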

