Assume I have the following piece of code in the SELECT clause of a query being executed on Spark:
...
MEAN(CASE
WHEN (col1 = 'A'
AND (col3 = 'A' OR col4 = 'B')) THEN col2
END) AS testing,
...
What would be the output of this query when col2 is NULL? Will the rows containing col2 = NULL be ignored by the MEAN function?
NULL values are ignored by the aggregate; if all the values are NULL, the result will be NULL. It will have the type of col2 -- this might matter in some databases (or if you are saving the result to a table).
What is the MEAN() function? To calculate the average, use AVG(). That is the standard SQL function for computing averages.
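Applied to the expression from the question, that suggestion would read as follows (some_table is a hypothetical table name):
SELECT AVG(CASE
           WHEN (col1 = 'A'
           AND (col3 = 'A' OR col4 = 'B')) THEN col2
           END) AS testing
FROM some_table;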
Disclaimer: I don't know Apache Spark!
I've created a SQL Fiddle - http://sqlfiddle.com/#!9/6f7d5e/3 .
If col2 is null, that row is not included in the average; if all the matching records are null, the result itself is null.
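For illustration, here is a minimal sketch of the behavior the Fiddle demonstrates (t is a hypothetical table; the comments show the expected results):
CREATE TABLE t (col2 INT);
INSERT INTO t VALUES (10), (20), (NULL);

SELECT AVG(col2) FROM t;
-- 15: the NULL row is skipped, so the divisor is 2, not 3

DELETE FROM t WHERE col2 IS NOT NULL;

SELECT AVG(col2) FROM t;
-- NULL: every remaining value is NULL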
I believe you meant AVG(). It will ignore NULL values. So if the result of the CASE expression is:
100
200
300
NULL
then the result would be (100 + 200 + 300) / 3 = 200 rather than (100 + 200 + 300) / 4 = 150. And if the result of the CASE expression is:
NULL
NULL
NULL
NULL
then the result will be NULL instead of 0.
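A quick way to verify this in Spark SQL is an inline VALUES table (t and x are just placeholder names):
SELECT AVG(x) FROM VALUES (100), (200), (300), (NULL) AS t(x);
-- 200.0

SELECT AVG(x) FROM VALUES (CAST(NULL AS INT)), (CAST(NULL AS INT)) AS t(x);
-- NULL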