Here is the configuration I am starting with:
DROP TABLE ruleset1;
CREATE TABLE ruleset1 (id int not null unique,score_rule1 float default 0.0,score_rule2 float default 0.0,score_rule3 float default 0.0);
DROP TABLE ruleset2;
CREATE TABLE ruleset2 (id int not null unique,score_rule1 float default 0.0,score_rule2 float default 0.0,score_rule3 float default 0.0);
insert into ruleset1 (id, score_rule1, score_rule2, score_rule3) values (0,0.8,0,0);
insert into ruleset1 (id, score_rule1, score_rule2, score_rule3) values (1,0,0.1,0);
insert into ruleset2 (id, score_rule1, score_rule2, score_rule3) values (0,0,0,0.3);
insert into ruleset2 (id, score_rule1, score_rule2, score_rule3) values (2,0,0.2,0);
I now have two tables,
ruleset1:
| ID | SCORE_RULE1 | SCORE_RULE2 | SCORE_RULE3
================================================
| 0 | 0.8 | 0 | 0
| 1 | 0 | 0.1 | 0
and ruleset2:
| ID | SCORE_RULE1 | SCORE_RULE2 | SCORE_RULE3
================================================
| 0 | 0 | 0 | 0.3
| 2 | 0 | 0.2 | 0
and I want to outer join them and calculate, for each ID, the mean of the non-zero columns, like this:
| ID | Average
================
| 0 | 0.55
| 1 | 0.1
| 2 | 0.2
My current query is:
select * from ruleset1 full outer join ruleset2 on ruleset1.id = ruleset2.id;
which gives an ugly result:
| ID | SCORE_RULE1 | SCORE_RULE2 | SCORE_RULE3 | ID | SCORE_RULE1 | SCORE_RULE2 | SCORE_RULE3
============================================================================================
| 0 | .8 | 0 | 0 | 0 | 0 | 0 | .3
| - | - | - | - | 2 | 0 | .2 | 0
| 1 | 0 | .1 | 0 | - | - | - | -
Can anyone help with a better query please?
Thank you very much!
Of course, avg ignores only NULLs, not zeroes, so NULLIF(column, 0) could be used to turn the zeroes into NULLs.
But since your data is denormalized, you can simply normalize it on the fly:
select id, avg(score)
from
(
select id, score_rule1 score
from ruleset1 where score_rule1 <> 0
union all
select id, score_rule2 from ruleset1 where score_rule2 <> 0
union all
select id, score_rule3 from ruleset1 where score_rule3 <> 0
union all
select id, score_rule1 from ruleset2 where score_rule1 <> 0
union all
select id, score_rule2 from ruleset2 where score_rule2 <> 0
union all
select id, score_rule3 from ruleset2 where score_rule3 <> 0
) dt
group by id;
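To check the normalized query against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for Oracle here (an assumption, chosen only because it needs no setup), and the helper names make_db and normalized_average are hypothetical. The six UNION ALL branches are generated in a loop, but the SQL that runs is the same as above.

```python
import sqlite3

def make_db():
    """Create an in-memory SQLite database holding the question's sample rows."""
    con = sqlite3.connect(":memory:")
    for name in ("ruleset1", "ruleset2"):
        con.execute(
            f"CREATE TABLE {name} (id INTEGER NOT NULL UNIQUE, "
            "score_rule1 REAL DEFAULT 0.0, score_rule2 REAL DEFAULT 0.0, "
            "score_rule3 REAL DEFAULT 0.0)"
        )
    con.executemany("INSERT INTO ruleset1 VALUES (?, ?, ?, ?)",
                    [(0, 0.8, 0, 0), (1, 0, 0.1, 0)])
    con.executemany("INSERT INTO ruleset2 VALUES (?, ?, ?, ?)",
                    [(0, 0, 0, 0.3), (2, 0, 0.2, 0)])
    return con

def normalized_average(con):
    """Average the non-zero scores per id by stacking every score column."""
    # One UNION ALL branch per (table, column) pair; zero scores are filtered out.
    branches = [
        f"SELECT id, {col} AS score FROM {table} WHERE {col} <> 0"
        for table in ("ruleset1", "ruleset2")
        for col in ("score_rule1", "score_rule2", "score_rule3")
    ]
    sql = ("SELECT id, AVG(score) FROM (" + " UNION ALL ".join(branches)
           + ") dt GROUP BY id ORDER BY id")
    # Round only for display; floating-point sums are not exact.
    return {row_id: round(avg, 2) for row_id, avg in con.execute(sql)}

averages = normalized_average(make_db())
print(averages)
```

With the sample data this should give 0.55 for id 0, 0.1 for id 1 and 0.2 for id 2, matching the result requested in the question.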
To avoid five UNION ALLs, you could use only one and add some extra logic:
select id, sum(score) / sum(score_count)
from
(
select id, score_rule1 + score_rule2 + score_rule3 score,
case when score_rule1 = 0 then 0 else 1 end +
case when score_rule2 = 0 then 0 else 1 end +
case when score_rule3 = 0 then 0 else 1 end score_count
from ruleset1
union all
select id, score_rule1 + score_rule2 + score_rule3 score,
case when score_rule1 = 0 then 0 else 1 end +
case when score_rule2 = 0 then 0 else 1 end +
case when score_rule3 = 0 then 0 else 1 end score_count
from ruleset2
) dt
group by id;
This assumes there are no NULLs in the score_rule columns.
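If those assumptions might not hold, a guarded variant could look like the sketch below. It wraps each column in COALESCE so NULL scores count as zero, and divides by NULLIF(SUM(score_count), 0) so an id whose scores are all zero yields NULL instead of a division by zero (Oracle raises an error in that case; SQLite happens to return NULL anyway, so the guard matters mainly for portability). The id 3 row, the NULL score, and the helper name branch are hypothetical additions for illustration, and SQLite via Python's sqlite3 is used only as a convenient stand-in.

```python
import sqlite3

COLS = ("score_rule1", "score_rule2", "score_rule3")

def branch(table):
    """One UNION ALL branch: per-row total and count of non-zero, non-NULL scores."""
    total = " + ".join(f"COALESCE({c}, 0)" for c in COLS)
    count = " + ".join(
        f"CASE WHEN COALESCE({c}, 0) = 0 THEN 0 ELSE 1 END" for c in COLS
    )
    return f"SELECT id, {total} AS score, {count} AS score_count FROM {table}"

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ruleset1 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
CREATE TABLE ruleset2 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
-- id 3 (all zeros) and the NULL score are hypothetical rows added for illustration
INSERT INTO ruleset1 VALUES (0, 0.8, 0, 0), (1, 0, 0.1, 0), (3, 0, 0, 0);
INSERT INTO ruleset2 VALUES (0, 0, 0, 0.3), (2, 0, 0.2, NULL);
""")

sql = (
    "SELECT id, SUM(score) / NULLIF(SUM(score_count), 0) FROM ("
    + branch("ruleset1") + " UNION ALL " + branch("ruleset2")
    + ") dt GROUP BY id ORDER BY id"
)
# A NULL average (no non-zero score at all) comes back as None.
guarded = {
    row_id: None if avg is None else round(avg, 2)
    for row_id, avg in con.execute(sql)
}
print(guarded)
```

Returning NULL for an id with no non-zero scores is a design choice; replacing it with 0 via an outer COALESCE would work just as well if that suits the report better.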
Here's an example written for PostgreSQL that you can adapt for Oracle (sorry, SQLFiddle's Oracle isn't cooperating). Thanks to Juan Carlos Oropeza's suggestion, the code below also runs well on Oracle: http://rextester.com/DVP59353
select
r.id,
sum(coalesce(r1.score_rule1,0) +
coalesce(r1.score_rule2,0) +
coalesce(r1.score_rule3,0) +
coalesce(r2.score_rule1,0) +
coalesce(r2.score_rule2,0) +
coalesce(r2.score_rule3,0)
)
/
sum(case when coalesce(r1.score_rule1,0) <> 0 then 1 else 0 end +
case when coalesce(r1.score_rule2,0) <> 0 then 1 else 0 end +
case when coalesce(r1.score_rule3,0) <> 0 then 1 else 0 end +
case when coalesce(r2.score_rule1,0) <> 0 then 1 else 0 end +
case when coalesce(r2.score_rule2,0) <> 0 then 1 else 0 end +
case when coalesce(r2.score_rule3,0) <> 0 then 1 else 0 end) as Average
from
(select id from ruleset1
union
select id from ruleset2) r
left join ruleset1 r1 on r.id = r1.id
left join ruleset2 r2 on r.id = r2.id
group by r.id
An SQLFiddle with the PostgreSQL version is here: http://sqlfiddle.com/#!15/24e3f/1
This example combines the ids from both tables using a union, so an ID that exists in both ruleset1 and ruleset2 appears just once in the result. r is the alias given to this generated table.
All the ids are then left joined with both tables. An id that is missing from one table gets NULLs for that table's columns, and adding NULL to anything yields NULL, which would wipe out the whole sum. So the NULLs are coalesced to zero in the math.
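The effect of the COALESCE calls can be seen by running the numerator of the query with and without them. This is a minimal sketch on SQLite through Python's sqlite3 module (a stand-in for PostgreSQL or Oracle, chosen as an assumption for convenience); the helper name totals and its wrap parameter are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ruleset1 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
CREATE TABLE ruleset2 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
INSERT INTO ruleset1 VALUES (0, 0.8, 0, 0), (1, 0, 0.1, 0);
INSERT INTO ruleset2 VALUES (0, 0, 0, 0.3), (2, 0, 0.2, 0);
""")

# The six score columns as seen through the r1 and r2 aliases.
TERMS = [f"{alias}.score_rule{n}" for alias in ("r1", "r2") for n in (1, 2, 3)]

def totals(wrap):
    """Sum all six score columns per id; `wrap` decides how each term is written."""
    expr = " + ".join(wrap(t) for t in TERMS)
    sql = f"""
        SELECT r.id, SUM({expr})
        FROM (SELECT id FROM ruleset1 UNION SELECT id FROM ruleset2) r
        LEFT JOIN ruleset1 r1 ON r.id = r1.id
        LEFT JOIN ruleset2 r2 ON r.id = r2.id
        GROUP BY r.id ORDER BY r.id
    """
    return {i: None if s is None else round(s, 2) for i, s in con.execute(sql)}

raw = totals(lambda t: t)                      # no NULL handling
safe = totals(lambda t: f"COALESCE({t}, 0)")   # NULLs treated as zero
print("without COALESCE:", raw)
print("with COALESCE:   ", safe)
```

Only id 0 exists in both tables, so it is the only id whose total survives without COALESCE; ids 1 and 2 come back as NULL (None in Python) because one side of the join is missing.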
dnoeth's answer is the easy and clean one. Here I was just playing with COALESCE and NVL2:
select COALESCE(r.ID, s.ID) as id,
COALESCE(r.score_rule1, 0) +
COALESCE(r.score_rule2, 0) +
COALESCE(r.score_rule3, 0) +
COALESCE(s.score_rule1, 0) +
COALESCE(s.score_rule2, 0) +
COALESCE(s.score_rule3, 0) as sum,
-- NVL2(x, 1, 0) gives 1 when x is not NULL; NULLIF turns zeros into NULL first
NVL2(NULLIF(r.score_rule1, 0), 1, 0) +
NVL2(NULLIF(r.score_rule2, 0), 1, 0) +
NVL2(NULLIF(r.score_rule3, 0), 1, 0) +
NVL2(NULLIF(s.score_rule1, 0), 1, 0) +
NVL2(NULLIF(s.score_rule2, 0), 1, 0) +
NVL2(NULLIF(s.score_rule3, 0), 1, 0) as tot
from ruleset1 r
full outer join ruleset2 s
on r.id = s.id;
Then your average is sum / tot.
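For readers without an Oracle instance, the same sum and count idea can be sketched on SQLite through Python's sqlite3 module. Two substitutions are made, both assumptions of this sketch rather than part of the answer: NVL2 is replaced by a CASE expression over NULLIF (SQLite has no NVL2), and the full outer join is emulated with a LEFT JOIN plus an anti-join, since older SQLite versions lack FULL OUTER JOIN. The column aliases a1 to b3 are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ruleset1 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
CREATE TABLE ruleset2 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
INSERT INTO ruleset1 VALUES (0, 0.8, 0, 0), (1, 0, 0.1, 0);
INSERT INTO ruleset2 VALUES (0, 0, 0, 0.3), (2, 0, 0.2, 0);
""")

JOINED_COLS = ["a1", "a2", "a3", "b1", "b2", "b3"]
# Sum of all scores, with NULLs from the outer join treated as zero.
total_expr = " + ".join(f"COALESCE({c}, 0)" for c in JOINED_COLS)
# Count of scores that are neither NULL nor zero (stands in for NVL2(NULLIF(x, 0), 1, 0)).
count_expr = " + ".join(
    f"CASE WHEN NULLIF({c}, 0) IS NOT NULL THEN 1 ELSE 0 END" for c in JOINED_COLS
)

sql = f"""
WITH joined AS (
    -- every ruleset1 row, with its ruleset2 match if one exists
    SELECT r.id AS id,
           r.score_rule1 AS a1, r.score_rule2 AS a2, r.score_rule3 AS a3,
           s.score_rule1 AS b1, s.score_rule2 AS b2, s.score_rule3 AS b3
    FROM ruleset1 r LEFT JOIN ruleset2 s ON r.id = s.id
    UNION ALL
    -- ruleset2 rows that have no partner in ruleset1
    SELECT s.id, NULL, NULL, NULL, s.score_rule1, s.score_rule2, s.score_rule3
    FROM ruleset2 s
    WHERE NOT EXISTS (SELECT 1 FROM ruleset1 r WHERE r.id = s.id)
)
SELECT id, {total_expr} AS total, {count_expr} AS tot FROM joined ORDER BY id
"""
rows = {i: (round(total, 2), tot) for i, total, tot in con.execute(sql)}
# Skip ids with no non-zero score to avoid dividing by zero.
averages = {i: round(total / tot, 2) for i, (total, tot) in rows.items() if tot}
print(rows)
print(averages)
```

Doing the final division in the client, as here, or in an outer SELECT is a matter of taste; either way the count has to be checked for zero first.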
Union all your two tables, unpivot, change the zeros into NULL with nullif, and use the standard avg() aggregate function:
select id, avg(nullif(value, 0)) as avg_value from (
select * from ruleset1
union all
select * from ruleset2
)
unpivot ( value for column_name in (score_rule1, score_rule2, score_rule3))
group by id
order by id
;
ID AVG_VALUE
---------- ----------
0 .55
1 .1
2 .2
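To see what NULLIF contributes here, the sketch below computes both AVG(value) and AVG(NULLIF(value, 0)) over the same unpivoted rows. It runs on SQLite through Python's sqlite3 module, which has no UNPIVOT clause, so the unpivot is written out by hand as a UNION ALL over the columns (an assumption of this sketch, not the Oracle syntax above).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ruleset1 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
CREATE TABLE ruleset2 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
INSERT INTO ruleset1 VALUES (0, 0.8, 0, 0), (1, 0, 0.1, 0);
INSERT INTO ruleset2 VALUES (0, 0, 0, 0.3), (2, 0, 0.2, 0);
""")

# Hand-written unpivot: one (id, value) row per table and score column, zeros included.
unpivoted = " UNION ALL ".join(
    f"SELECT id, {col} AS value FROM {table}"
    for table in ("ruleset1", "ruleset2")
    for col in ("score_rule1", "score_rule2", "score_rule3")
)
sql = (
    "SELECT id, AVG(value), AVG(NULLIF(value, 0)) "
    f"FROM ({unpivoted}) u GROUP BY id ORDER BY id"
)
results = {row_id: (plain, non_zero) for row_id, plain, non_zero in con.execute(sql)}
for row_id, (plain, non_zero) in results.items():
    print(f"id={row_id}  avg(value)={plain:.4f}  avg(nullif(value, 0))={non_zero:.4f}")
```

The plain average divides by every unpivoted value, zeros included, so it comes out much lower; only the NULLIF version divides by the number of non-zero scores.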
SELECT s.id, AVG(s.score)
FROM(
SELECT id,score_rule1+score_rule2+score_rule3 as score
FROM ruleset2
UNION ALL
SELECT id,(score_rule1+score_rule2+score_rule3) as score
FROM ruleset1) s
group by s.id
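One caveat with this last query: it averages the per-row sums, so it matches the requested result only while each row holds at most one non-zero score, as in the sample data. The sketch below uses a hypothetical row with two non-zero scores to show the difference. It runs on SQLite through Python's sqlite3 module, chosen as a stand-in for convenience.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ruleset1 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
CREATE TABLE ruleset2 (id INTEGER NOT NULL UNIQUE, score_rule1 REAL, score_rule2 REAL, score_rule3 REAL);
-- hypothetical data: the ruleset1 row has TWO non-zero scores
INSERT INTO ruleset1 VALUES (0, 0.8, 0.4, 0);
INSERT INTO ruleset2 VALUES (0, 0, 0, 0.3);
""")

# The query from the answer: average of the per-row sums.
avg_of_row_sums = con.execute("""
    SELECT s.id, AVG(s.score)
    FROM (
        SELECT id, score_rule1 + score_rule2 + score_rule3 AS score FROM ruleset2
        UNION ALL
        SELECT id, score_rule1 + score_rule2 + score_rule3 AS score FROM ruleset1
    ) s
    GROUP BY s.id
""").fetchone()[1]

# Mean over the individual non-zero scores, which is what the question asks for.
unpivoted = " UNION ALL ".join(
    f"SELECT id, score_rule{n} AS value FROM {table}"
    for table in ("ruleset1", "ruleset2")
    for n in (1, 2, 3)
)
mean_of_non_zero = con.execute(
    f"SELECT id, AVG(NULLIF(value, 0)) FROM ({unpivoted}) u GROUP BY id"
).fetchone()[1]

print(round(avg_of_row_sums, 2), round(mean_of_non_zero, 2))
```

Here the row sums are 1.2 and 0.3, so their average is about 0.75, while the mean of the three non-zero scores (0.8, 0.4, 0.3) is about 0.5.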