
Apache Spark DataFrame: df.where() with Java:List attribute

Imagine you have a DataFrame like this:

a b  
1 1  
1 2  
1 3  
2 1  
2 2  
2 3  

and you want to implement a generic .where functionality. How can you filter by a List?

val l1:List[Int] = List (1,2)  
df.where($"b" === l1:_*) // does not work

Or is there even an option to ask something like this:

df.where($"a" === l1:_* && $"b" === l1:_*)

If I got you right, you want IN semantics:

df.where($"b" isin (l1: _*)).show()
+---+---+ 
|  a|  b| 
+---+---+ 
|  1|  1| 
|  1|  2| 
|  2|  1| 
|  2|  2| 
+---+---+ 

And for the combined condition on both columns:

df.where(($"a" isin (l1: _*)) and ($"b" isin (l1: _*))).show()
+---+---+ 
|  a|  b| 
+---+---+ 
|  1|  1| 
|  1|  2| 
|  2|  1| 
|  2|  2| 
+---+---+ 
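
Since the question asks for a generic .where, here is a minimal sketch of how the two calls above could be folded into one helper. The object name WhereInListSketch, the helper whereAllIn, and the local SparkSession setup are assumptions made for illustration and are not part of the original answer; only col, lit, isin, && and where are standard Spark API.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object WhereInListSketch {
  // Hypothetical helper: keeps the rows where, for every entry in `allowed`,
  // the named column's value is contained in the given list (SQL IN semantics).
  def whereAllIn(df: DataFrame, allowed: Map[String, Seq[Any]]): DataFrame = {
    val condition: Column = allowed
      .map { case (name, values) => col(name).isin(values: _*) }
      // Start from a literal `true` so an empty map filters nothing out.
      .foldLeft(lit(true))(_ && _)
    df.where(condition)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, only needed to run the sketch standalone.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("where-in-list")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the question.
    val df = Seq((1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)).toDF("a", "b")
    val l1 = List(1, 2)

    // Equivalent to ($"a" isin (l1: _*)) and ($"b" isin (l1: _*)).
    whereAllIn(df, Map("a" -> l1, "b" -> l1)).show()

    spark.stop()
  }
}
```

Passing a Map of column name to allowed values keeps the call site short when the set of filtered columns is only known at runtime. On Spark 2.4 and later, col(name).isInCollection(values) should work as an alternative that avoids the varargs expansion.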


 