I have a sample dataset such as the following:
Id  Name           ReferredBy
1   John Doe       NULL
2   Jane Smith     NULL
3   Anne Jenkins   2
4   Eric Branford  NULL
5   Pat Richards   1
6   Alice Barnes   2
If I want to select all records not referred by Jane Smith, I would use the following query:
SELECT Name FROM Customers WHERE ReferredBy <> 2;
On SQL Server, this also excludes rows where ReferredBy is NULL, so I need to write it in the following way:
SELECT Name FROM Customers WHERE ReferredBy IS NULL OR ReferredBy <> 2
Does HiveQL have the same issue?
*It is hard to test this on the raw dataset I have, since it is quite large with very few missing values.
Thanks!
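The behavior of NULL is defined by the SQL standard, and databases generally respect it, Hive included. So yes: ReferredBy <> 2 evaluates to NULL when ReferredBy is NULL, and WHERE filters those rows out. That said, the standard also specifies NULL-safe comparison operators, IS NOT DISTINCT FROM and IS DISTINCT FROM. Hive supports a NULL-safe operator for equality, but not with that syntax.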
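To see why NULL rows drop out of a filter, here is a minimal sketch in Python. It assumes SQLite (via the standard-library sqlite3 module) is an acceptable stand-in for the standard NULL semantics; it is not Hive, and the one-row subquery is made up purely for illustration.

```python
import sqlite3

# SQLite is only a stand-in for standard SQL NULL semantics; it is not Hive.
conn = sqlite3.connect(":memory:")

# A comparison involving NULL evaluates to NULL (None in Python),
# not to true or false.
null_ne, null_eq, plain_ne = conn.execute(
    "SELECT NULL <> 2, NULL = NULL, 1 <> 2"
).fetchone()
print(null_ne, null_eq, plain_ne)

# WHERE keeps only rows whose predicate is true, so a NULL result drops the row.
kept = conn.execute(
    "SELECT COUNT(*) FROM (SELECT NULL AS ReferredBy) WHERE ReferredBy <> 2"
).fetchone()[0]
print(kept)
```

The first two values come back as None (SQL NULL), which is why a row with a NULL ReferredBy never satisfies ReferredBy <> 2.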
For your logic, you can use the Hive extension <=>:
where not (ReferredBy <=> 2)
The <=> operator is the NULL-safe equality comparison, so it returns "true" for NULL <=> NULL and "false" for NULL <=> 2, instead of NULL in both cases. This is presumably borrowed from MySQL.
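Since testing on the real data is awkward, one option is to reproduce the six sample rows in a scratch database and compare the three predicates. The sketch below again assumes SQLite as a stand-in: SQLite does not support <=>, so its NULL-safe IS operator is substituted for it here. The helper name `names` is hypothetical, and on Hive you would run the <=> form against a small test table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Id INTEGER, Name TEXT, ReferredBy INTEGER)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "John Doe", None),
        (2, "Jane Smith", None),
        (3, "Anne Jenkins", 2),
        (4, "Eric Branford", None),
        (5, "Pat Richards", 1),
        (6, "Alice Barnes", 2),
    ],
)

def names(where):
    """Return customer names matching the given WHERE clause, ordered by Id."""
    rows = conn.execute(f"SELECT Name FROM Customers WHERE {where} ORDER BY Id")
    return [row[0] for row in rows]

# Plain inequality: rows with a NULL ReferredBy are silently dropped.
plain = names("ReferredBy <> 2")

# Explicit NULL check, as in the question.
explicit = names("ReferredBy IS NULL OR ReferredBy <> 2")

# NULL-safe form: SQLite's IS stands in for Hive's <=> in this sketch.
null_safe = names("NOT (ReferredBy IS 2)")

print(plain)
print(explicit)
print(null_safe)
```

With the sample data, the plain inequality returns only Pat Richards, while the explicit NULL check and the NULL-safe form both also return John Doe, Jane Smith, and Eric Branford.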