Impute all null values with the most frequent values of the corresponding columns in Oracle SQL with a single update statement
I'm trying to impute all the null values present in an Oracle database table. Suppose the table contains the following rows:
ID Col1 Col2
-------------------
1 Male USA
2 Male USA
3 Female Russia
4 (null) USA
5 Male (null)
6 Male USA
7 Female USA
8 (null) Canada
9 Male USA
Here "Male" is the most frequent value in Col1 and "USA" is the most frequent value in Col2. I want all the null values in Col1 to be replaced by "Male" and all the null values in Col2 to be replaced by "USA". In case of a tie, any of the tied values can be used as the replacement.
So, the final table will look like this:
ID Col1 Col2
-------------------
1 Male USA
2 Male USA
3 Female Russia
4 Male USA
5 Male USA
6 Male USA
7 Female USA
8 Male Canada
9 Male USA
So far, this is what I've done:
UPDATE tablename
SET
    col1 = (
        SELECT
            col1
        FROM
            tablename
        GROUP BY
            col1
        ORDER BY
            COUNT(*) DESC
        FETCH FIRST 1 ROWS ONLY
    )
WHERE
    col1 IS NULL;

UPDATE tablename
SET
    col2 = (
        SELECT
            col2
        FROM
            tablename
        GROUP BY
            col2
        ORDER BY
            COUNT(*) DESC
        FETCH FIRST 1 ROWS ONLY
    )
WHERE
    col2 IS NULL;
Here I find the most frequent value of each column and update that column with it. This works fine for a table with only 2 columns, but for a table with more than 20 columns the process becomes messy. Is there a better way to do this?
Compute stats_mode for each column in a separate subquery, and apply nvl over a cross join with its result. Like this:
with
  inputs (id, col1, col2) as (
    select 1, 'Male'  , 'USA'    from dual union all
    select 2, 'Male'  , 'USA'    from dual union all
    select 3, 'Female', 'Russia' from dual union all
    select 4, null    , 'USA'    from dual union all
    select 5, 'Male'  , null     from dual union all
    select 6, 'Male'  , 'USA'    from dual union all
    select 7, 'Female', 'USA'    from dual union all
    select 8, null    , 'Canada' from dual union all
    select 9, 'Male'  , 'USA'    from dual
  )
select i.id,
       nvl(i.col1, m.col1_mode) as col1,
       nvl(i.col2, m.col2_mode) as col2
from   inputs i cross join
       (select stats_mode(col1) as col1_mode,
               stats_mode(col2) as col2_mode from inputs) m
;
ID COL1 COL2
---------- ------ ------
1 Male USA
2 Male USA
3 Female Russia
4 Male USA
5 Male USA
6 Male USA
7 Female USA
8 Male Canada
9 Male USA
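The query above only selects the imputed values, while the question asks for a single UPDATE. A minimal sketch of how the same idea might be turned into one statement follows. It assumes the table is called tablename with columns col1 and col2, as in the question, and it relies on stats_mode ignoring nulls and returning one arbitrary value when there is a tie. It has not been run against the asker's actual table, so treat it as a starting point.

```sql
-- Sketch only: tablename, col1 and col2 are the names used in the question.
-- The inline view m computes one mode per column in a single pass;
-- NVL keeps every non-null value and fills only the nulls.
UPDATE tablename t
SET (col1, col2) = (
        SELECT NVL(t.col1, m.col1_mode),
               NVL(t.col2, m.col2_mode)
        FROM (
                SELECT STATS_MODE(col1) AS col1_mode,
                       STATS_MODE(col2) AS col2_mode
                FROM   tablename
             ) m
    )
-- Touch only rows that actually have something to impute.
WHERE t.col1 IS NULL
   OR t.col2 IS NULL;
```

Because the modes are computed from the table as it stood when the statement started, rows filled in during the update should not influence the result. For a table with many columns, each extra column adds one entry to the SET list, one NVL expression, one STATS_MODE expression and one condition in the WHERE clause; those repetitive lists could probably be generated from the USER_TAB_COLUMNS data dictionary view instead of being typed by hand.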