Spark isin with multiple list

I want to call isin with values taken from several different lists.

  val STATUT_ID_OK : List[String] = List("103","104","613")
  val STATUT_ID_KO : List[String] = List("106","546","609","17")
  val STATUT_ID_KO_AND_OK = STATUT_ID_OK  :: STATUT_ID_KO 

But when I try to use STATUT_ID_KO_AND_OK, I get this error:

Unsupported literal type class scala.collection.immutable.$colon$colon List(103, 104, 613)
java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(103, 603, 613)
    at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:78)
    at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:164)
    at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$create$2.apply(literals.scala:164)
    at scala.util.Try.getOrElse(Try.scala:79)
    ...

My code:

col("mycol").isin(STATUT_ID_KO_AND_OK :_*)

I tried different things, without success:

col("mycol").isin(STATUT_ID_OK :_*,STATUT_ID_KO :_* )
col("mycol").isInCollection(STATUT_ID_KO_AND_OK)

If you want to check whether a column value is present in a list, use isInCollection or isin, as shown below:

val STATUT_ID_OK : List[String] = List("40","104","613")
val STATUT_ID_KO : List[String] = List("106","546","30","17")

val STATUT_ID_KO_AND_OK:List[String] = STATUT_ID_OK  ::: STATUT_ID_KO

df.withColumn("newColName", $"colName".isInCollection(STATUT_ID_KO_AND_OK))

//or
df1.withColumn("new", $"col".isin(STATUT_ID_KO_AND_OK: _*))

To concatenate the two lists, use :::, as in the snippet above.
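The difference can be checked without Spark. The sketch below uses a hypothetical helper, isinLike, which is not part of Spark and only mirrors the parameter shape of Column.isin(list: Any*), so that the arguments it would receive can be inspected. The object and value names are assumptions made for illustration.

```scala
object VarargsSketch {
  // Hypothetical stand-in for Column.isin(list: Any*): it simply returns
  // the arguments it was given so they can be examined.
  def isinLike(list: Any*): Seq[Any] = list

  val ok: List[String] = List("103", "104", "613")
  val ko: List[String] = List("106", "546", "609", "17")

  def main(args: Array[String]): Unit = {
    // Concatenate first, then expand the single combined list with `: _*`.
    val flat = isinLike((ok ::: ko): _*)
    println(flat.forall(_.isInstanceOf[String])) // true

    // With `::`, the first argument is itself a List rather than a String.
    val nested = isinLike((ok :: ko): _*)
    println(nested.count(_.isInstanceOf[List[_]])) // 1
  }
}
```

That nested List argument corresponds to the value named in the "Unsupported literal type" message. Note also that, in the Scala 2 versions Spark is commonly built against, only one `: _*` expansion can be passed per call, which is presumably why isin(STATUT_ID_OK :_*, STATUT_ID_KO :_*) was rejected; concatenating the lists first avoids the issue.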

You need to concatenate the two lists with ::: (rather than prepend one list to the other with ::), as shown below:

val STATUT_ID_OK : List[String] = List("103","104","613")
val STATUT_ID_KO : List[String] = List("106","546","609","17")
val STATUT_ID_KO_AND_OK: List[String] = STATUT_ID_OK  ::: STATUT_ID_KO

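// Assumes a SparkSession named `spark` is in scope (as in spark-shell).
import org.apache.spark.sql.functions.col
import spark.implicits._

val df = Seq("100", "101", "102", "103").toDF("mycol")

// Keep only the rows whose value appears in the combined list.
df.where(col("mycol").isin(STATUT_ID_KO_AND_OK: _*)).show(false)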

// returns
+-----+
|mycol|
+-----+
|103  |
+-----+
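
As a plain-Scala illustration of why the operator matters, here is a minimal sketch comparing :: and ::: on the same two lists. The object and value names are made up for the example.

```scala
object ListConcatSketch {
  val ok: List[String] = List("103", "104", "613")
  val ko: List[String] = List("106", "546", "609", "17")

  // `::` prepends its left operand as ONE element, so the whole `ok` list
  // becomes the head of the result.
  val prepended = ok :: ko

  // `:::` concatenates the two lists element by element.
  val concatenated: List[String] = ok ::: ko

  def main(args: Array[String]): Unit = {
    println(prepended.size)    // 5
    println(prepended.head)    // List(103, 104, 613)
    println(concatenated.size) // 7
  }
}
```

The prepended list no longer type-checks as List[String], which is why adding the explicit type annotation to STATUT_ID_KO_AND_OK, as in the snippet above, turns this mistake into a compile-time error instead of a runtime exception.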
