I want a generic query to get fill rate of all columns in table.Query should work irrespective of the column number.I have to implement this using presto sql.I have tried searching for a method but nothing seems to working.
Input
A | B | C | D |
---|---|---|---|
1 | null | null | 1 |
2 | 2 | 3 | 4 |
Null | Null | Null | 5 |
Output
A | B | C | D |
---|---|---|---|
0.66 | 0.33 | 0.33 | 1.0 |
Explanation: A Col contains 3 rows with 2 non null values so 2/3 B and C Cols contain 2 null value and one non null value so 1/3 D col there is no null values so 3/3
Thanks in advance
AFAIK Presto/Trino does not provide dynamic query execution capabilities (ie something like EXEC in T-SQL) so the only option (unless you are ready to go down user defined function road) to write a query which will enumerate all needed columns (if you are using client from another language - you can build the query dynamically leveraging information_schema.columns
info):
with dataset(A, B, C, D) as (
values (1, null, null, 1),
(2, 2, 3, 4),
(Null, Null, Null, 5)
)
select 1.00 * count_if(a is not null) / count(*) a,
1.00 * count_if(b is not null) / count(*) b,
1.00 * count_if(c is not null) / count(*) c,
1.00 * count_if(d is not null) / count(*) d
from dataset;
Output:
a | b | c | d |
---|---|---|---|
0.67 | 0.33 | 0.33 | 1.00 |
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.