How to choose the previous non-null value in a column for null values resulting from an outer join
Say I have a table A which, as the result of an outer join, has null values in a particular column, say Value. If that column represents a running quantity such as a cumulative sum (cumsum), it doesn't make sense for it to drop to null in the middle:
| ID | Value |
| --- | ----- |
| 1 | null |
| 2 | 576 |
| 3 | null |
| 4 | 695 |
| 5 | null |
So is it possible to transform table A into a table B that looks like this?
| ID | Value |
| --- | ----- |
| 1 | 0 |
| 2 | 576 |
| 3 | 576 |
| 4 | 695 |
| 5 | 695 |
In other words, is it possible to replace the null values in a column with the previous non-null value or, if no such value is available, with a default value of 0? This transformation also has to be applied to every column of the table, which has 10 or so columns.
This should be compatible with standard SQL, and it definitely works on PostgreSQL:
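```sql
SELECT a1.id, COALESCE(a2.value, 0)
FROM (
    -- For each row, find the id of the latest row at or before it that has a non-null value.
    SELECT a1.id AS id, MAX(a2.id) AS idref
    FROM a a1
    LEFT JOIN a a2 ON a2.id <= a1.id AND a2.value IS NOT NULL
    GROUP BY a1.id
) AS a1
-- Fetch the value from that referenced row; no match means no earlier value, so fall back to 0.
LEFT JOIN a a2 ON a2.id = a1.idref
```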
A double join is required, which doesn't look optimal to me, but at least it works.
Concerning your comment: the pattern can be extended, but you'd need to add another two joins for each further column if the respective existing values might come from different rows.
Edit:
While still requiring two sub-queries (instead of joins), this variant seems to perform slightly better on PostgreSQL:
```sql
SELECT
    a1.id,
    COALESCE(
        (
            SELECT a2.value
            FROM a a2
            WHERE a2.id = (
                -- id of the latest row at or before a1 with a non-null value
                SELECT MAX(a3.id)
                FROM a a3
                WHERE a3.id <= a1.id AND a3.value IS NOT NULL
            )
        ),
        0
    )
FROM a a1
```
You'll still need a set of two sub-queries for every additional column (unless all the data can be retrieved from the same row), though the performance gain will likely accumulate then.
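For the 10-or-so-column case, the repeated pair of sub-queries can be generated rather than typed out. The following is only a sketch under assumptions: it uses Python's built-in `sqlite3` module as a throwaway test bed (not PostgreSQL), a hypothetical second column `other` added to table `a`, and a hypothetical helper `fill_expr` that builds one `COALESCE(...)` expression per column.

```python
import sqlite3

# Throwaway in-memory database; the column `other` is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, value INTEGER, other INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, None, 7), (2, 576, None), (3, None, None), (4, 695, 9), (5, None, None)],
)


def fill_expr(col):
    """Build the two nested sub-queries for one column (hypothetical helper).

    `col` is interpolated into the SQL text, so it must be a trusted column name.
    """
    return (
        f"COALESCE((SELECT a2.{col} FROM a a2 WHERE a2.id = "
        f"(SELECT MAX(a3.id) FROM a a3 "
        f"WHERE a3.id <= a1.id AND a3.{col} IS NOT NULL)), 0) AS {col}"
    )


columns = ["value", "other"]  # extend this list to cover every column to fill
query = (
    "SELECT a1.id, "
    + ", ".join(fill_expr(c) for c in columns)
    + " FROM a a1 ORDER BY a1.id"
)

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each column is filled independently, so the previous non-null value of `value` and of `other` may come from different rows, which is the case that needs one pair of sub-queries per column.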
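```sql
-- Alternative using window functions (PostgreSQL syntax).
WITH cte AS (
    SELECT
        id,
        value,
        -- Running count of non-null values: each non-null row starts a new group,
        -- and the nulls that follow it keep the same group number.
        count(value) OVER (ORDER BY id) AS grp
    FROM (
        VALUES (1, NULL),
               (2, 576),
               (3, NULL),
               (4, 695),
               (5, NULL),
               (6, NULL)) g (id, value) ORDER BY id
)
SELECT
    id,
    value,
    -- The first row of each group carries its non-null value; leading nulls fall back to 0.
    coalesce(first_value(value) OVER (PARTITION BY grp ORDER BY id), 0)
FROM
    cte
ORDER BY
    id;
```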
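The window-function query above comes without an explanation, so here is a plain-Python trace of what its two steps appear to compute on the same six sample rows. It is meant only to illustrate the logic, not to replace the SQL; the variable names (`groups`, `first_in_group`, `filled`) are made up for this sketch.

```python
# Same sample rows as the VALUES list in the query, already ordered by id.
rows = [(1, None), (2, 576), (3, None), (4, 695), (5, None), (6, None)]

# Step 1: emulate count(value) OVER (ORDER BY id).
# count() ignores nulls, so the running count only increases on non-null rows.
groups = []
running = 0
for _row_id, value in rows:
    if value is not None:
        running += 1
    groups.append(running)

# Step 2: emulate first_value(value) OVER (PARTITION BY <running count> ORDER BY id).
# The first row of every group is the non-null row that opened it,
# except for group 0, which holds only the leading nulls.
first_in_group = {}
for (_row_id, value), grp in zip(rows, groups):
    first_in_group.setdefault(grp, value)

# Step 3: emulate COALESCE(..., 0) for the rows that have no previous value.
filled = [
    (row_id, first_in_group[grp] if first_in_group[grp] is not None else 0)
    for (row_id, _value), grp in zip(rows, groups)
]

print(groups)
print(filled)
```

Because the grouping depends on the null pattern of a single column, filling several columns this way would need one running count and one `first_value` per column.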