Spark isin with multiple lists
I want to use isin with several different lists.
val STATUT_ID_OK : List[String] = List("103","104","613")
val STATUT_ID_KO : List[String] = List("106","546","609","17")
val STATUT_ID_KO_AND_OK = STATUT_ID_OK :: STATUT_ID_KO
But when I try to use STATUT_ID_KO_AND_OK, I get this error:
Unsupported literal type class scala.collection.immutable.$colon$colon List(103, 104, 613)
java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(103, 603, 613)
at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:78)
at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:164)
at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:164)
at scala.util.Try.getOrElse(Try.scala:79)
...
My code:
col("mycol").isin(STATUT_ID_KO_AND_OK :_*)
I tried a few other things, without success:
col("mycol").isin(STATUT_ID_OK :_*,STATUT_ID_KO :_* )
col("mycol").isInCollection(STATUT_ID_KO_AND_OK)
If you want to check whether a column value is present in a list, use isInCollection or isin, as shown below:
val STATUT_ID_OK : List[String] = List("40","104","613")
val STATUT_ID_KO : List[String] = List("106","546","30","17")
val STATUT_ID_KO_AND_OK:List[String] = STATUT_ID_OK ::: STATUT_ID_KO
df.withColumn("newColName", $"colName".isInCollection(STATUT_ID_KO_AND_OK))
//or
df1.withColumn("new", $"col".isin(STATUT_ID_KO_AND_OK: _*))
To merge the two lists into one, use :::.
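Both isin and isInCollection expect a single flat collection of values, so the two lists have to be merged before the `: _*` expansion. The sketch below illustrates this in plain Scala, without Spark: `isinStandIn` is a hypothetical stand-in for a varargs method shaped like `Column.isin(list: Any*)`, and it only reports the arguments it received.

```scala
object VarargsSketch {
  // Hypothetical stand-in for a varargs method such as Column.isin(list: Any*).
  // It is NOT Spark code; it just returns the arguments it was given.
  def isinStandIn(values: Any*): Seq[Any] = values.toList

  val ok: List[String] = List("103", "104", "613")
  val ko: List[String] = List("106", "546", "609", "17")

  // Merge first, then expand the single merged list with `: _*`.
  val expanded: Seq[Any] = isinStandIn((ok ::: ko): _*)

  // Without `: _*`, the whole list arrives as one argument.
  val notExpanded: Seq[Any] = isinStandIn(ok ::: ko)

  def main(args: Array[String]): Unit = {
    println(expanded.size)    // 7
    println(notExpanded.size) // 1
  }
}
```

As far as I know, Scala accepts a `: _*` expansion only as the sole argument bound to the repeated parameter, which would explain why passing two expanded lists in one call does not work.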
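You need to concatenate the two lists with ::: rather than prepend one to the other with ::. With ::, the whole STATUT_ID_OK list becomes a single element of the result, and Spark cannot turn that nested List into a literal, hence the error. Using ::: instead: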
import org.apache.spark.sql.functions.col
import spark.implicits._ // for toDF; assumes a SparkSession named spark (as in spark-shell)

val STATUT_ID_OK : List[String] = List("103","104","613")
val STATUT_ID_KO : List[String] = List("106","546","609","17")
val STATUT_ID_KO_AND_OK: List[String] = STATUT_ID_OK ::: STATUT_ID_KO
val df = Seq(("100"), ("101"), ("102"), ("103")).toDF("mycol")
df.where(col("mycol").isin(STATUT_ID_KO_AND_OK: _*)).show(false)
// returns
+-----+
|mycol|
+-----+
|103  |
+-----+
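The difference between the two operators can be checked without Spark. The following is a minimal sketch in plain Scala; the object name `ConsVersusConcat` and the value names `prepended` and `merged` are only illustrative.

```scala
object ConsVersusConcat {
  val STATUT_ID_OK: List[String] = List("103", "104", "613")
  val STATUT_ID_KO: List[String] = List("106", "546", "609", "17")

  // `::` prepends its left operand as ONE element, so the first element of
  // the result is itself a List and the element type widens to a supertype.
  val prepended = STATUT_ID_OK :: STATUT_ID_KO

  // `:::` concatenates element by element and keeps the type List[String].
  val merged: List[String] = STATUT_ID_OK ::: STATUT_ID_KO

  def main(args: Array[String]): Unit = {
    println(prepended.head) // List(103, 104, 613)
    println(prepended.size) // 5
    println(merged.size)    // 7
  }
}
```

The nested first element of `prepended` is the same `List(103, 104, 613)` value that the error message reports as an unsupported literal.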