Apache Spark DataFrame: df.where() with Java:List attribute
Imagine you have a DataFrame like this:
a b
1 1
1 2
1 3
2 1
2 2
2 3
and you want to implement generic .where functionality. How can you filter by a List?
val l1:List[Int] = List (1,2)
df.where($"b" === l1:_*) // does not work
Or is there even an option to ask for something like this:
df.where($"a" === l1:_* && $"b" === l1:_*)
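If I understand you correctly, you want IN semantics. The === operator compares a column against a single value, so it cannot take an expanded list; Column.isin accepts varargs instead: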
df.where($"b" isin (l1: _*)).show()
+---+---+
|  a|  b|
+---+---+
|  1|  1|
|  1|  2|
|  2|  1|
|  2|  2|
+---+---+
And, to apply the same list to both columns:
df.where(($"a" isin (l1: _*)) and ($"b" isin (l1: _*))).show()
+---+---+
|  a|  b|
+---+---+
|  1|  1|
|  1|  2|
|  2|  1|
|  2|  2|
+---+---+
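For the "generic .where" part of the question, one possible approach is to fold a map of column names to allowed values into a single isin condition. This is only a rough sketch: the object name IsinFilterSketch, the helper whereIn, and the local SparkSession setup are hypothetical names introduced for illustration, not part of the original answer or of the Spark API.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object IsinFilterSketch {
  // Hypothetical helper: keeps rows where every listed column's value
  // appears in that column's list of allowed values.
  def whereIn(df: DataFrame, allowed: Map[String, Seq[Any]]): DataFrame = {
    // Start from a literal TRUE and AND one isin predicate per column.
    val condition: Column = allowed.foldLeft(lit(true)) {
      case (acc, (name, values)) => acc && col(name).isin(values: _*)
    }
    df.where(condition)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to make the sketch runnable.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isin-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the question: a in {1, 2}, b in {1, 2, 3}.
    val df = (for (a <- 1 to 2; b <- 1 to 3) yield (a, b)).toDF("a", "b")
    val l1 = List(1, 2)

    // Equivalent to ($"a" isin (l1: _*)) and ($"b" isin (l1: _*)).
    whereIn(df, Map("a" -> l1, "b" -> l1)).show()

    spark.stop()
  }
}
```

With an empty map the condition stays a literal TRUE, so no rows are filtered out. On Spark 2.4 or later, Column.isInCollection should also accept the list directly, without the : _* expansion.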