I need to check a few conditions, so I filtered my RDD this way:
scala> file.filter(r => r(38)=="0").filter(r => r(2)=="0").filter(r => r(3)=="0").count
Is this a correct alternative to using "&&"?
Yes, in your case a series of filters is semantically equivalent to a single filter with &&:
file.filter(r => r(38) == "0" && r(2) == "0" && r(3) == "0")
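However, the single-filter variant above is not guaranteed to be noticeably faster than the chained version, for two reasons.
First, && is a short-circuit operator: the next comparison happens only if the previous one evaluates to true. A chained filter likewise passes a row on to the next filter only if its own predicate holds, so the number of comparisons is the same in both cases.
Second, the chained version does not make three passes over the RDD. filter is a narrow transformation, and Spark pipelines consecutive narrow transformations within a single stage, so each row flows through all three predicates in one pass. The only extra cost of chaining is two additional iterator wrappers and function calls per row, which is usually negligible.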
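One way to check both the equivalence and the number of passes without a cluster is to run the two forms over a plain Scala Iterator, since Spark applies filter to each partition's iterator. The sketch below is only an illustration: FilterDemo, row, rows, source and the pulled counter are hypothetical names made up for this example, and the sample data is invented.

```scala
object FilterDemo {
  // Hypothetical helper: builds a 39-field row, setting only the fields
  // the question tests (indices 2, 3 and 38).
  def row(f2: String, f3: String, f38: String): Array[String] = {
    val r = Array.fill(39)("x")
    r(2) = f2; r(3) = f3; r(38) = f38
    r
  }

  // Invented sample data: the first and last rows satisfy all three conditions.
  val rows: Seq[Array[String]] = Seq(
    row("0", "0", "0"),
    row("0", "1", "0"),
    row("1", "0", "0"),
    row("0", "0", "1"),
    row("0", "0", "0")
  )

  // Counts how many rows are read from the source, to reveal extra passes.
  var pulled = 0
  def source(): Iterator[Array[String]] =
    rows.iterator.map { r => pulled += 1; r }

  // Three chained filters, as in the question.
  def chainedCount(): Int =
    source().filter(r => r(38) == "0").filter(r => r(2) == "0").filter(r => r(3) == "0").size

  // One filter with &&, as in the answer.
  def combinedCount(): Int =
    source().filter(r => r(38) == "0" && r(2) == "0" && r(3) == "0").size

  def main(args: Array[String]): Unit = {
    pulled = 0
    val chained = chainedCount()
    val pulledChained = pulled
    pulled = 0
    val combined = combinedCount()
    val pulledCombined = pulled
    println(s"matches: chained=$chained combined=$combined")
    println(s"rows read: chained=$pulledChained combined=$pulledCombined")
  }
}
```

Both forms return the same count and read each source row exactly once, because the chained filters compose lazily instead of materializing intermediate results. On a real RDD the same comparison can be made by calling count on both expressions.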