Using Scala lists in Spark SQL queries
I am trying to run the following query:
val IgnoreList = List(""," ","0","-","{}","()","[]","null","Null","NULL","false","False","FALSE","NA","na","Na","n/a","N/a","N/A","nil","Nil","NIL")
val df = sqlContext.sql(s"select userName from names where userName not in $IgnoreList")
But this does not work. I also tried:
val IgnoreList = List(""," ","0","-","{}","()","[]","null","Null","NULL","false","False","FALSE","NA","na","Na","n/a","N/a","N/A","nil","Nil","NIL")
sqlContext.udf.register("SqlList",(s: List[String]) => "('" + s.mkString("','") + "')")
val df = sqlContext.sql(s"select userName from names where userName not in SqlList($IgnoreList)")
But this does not work either. Any suggestions?
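Your first attempt fails because it calls the default toString of List, which does not produce valid SQL syntax. Your second attempt fails because it makes no sense to use a UDF to build the SQL string: a UDF is applied to records (or columns) while the query runs, not used to construct the query text.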
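To make the first failure concrete, here is a minimal sketch in plain Scala (no Spark required) comparing what the interpolator inserts with what SQL expects. The object name and the three-element list are hypothetical stand-ins for IgnoreList, chosen only to keep the output short.

```scala
object ToStringVsMkString {
  // Hypothetical short list standing in for IgnoreList
  val ignore: List[String] = List("NA", "n/a", "nil")

  // Interpolating the list directly uses the default List.toString
  val viaToString: String = s"userName not in $ignore"

  // Build a parenthesised, comma-separated list of quoted literals first
  val inList: String = ignore.mkString("('", "','", "')")
  val viaMkString: String = s"userName not in $inList"

  def main(args: Array[String]): Unit = {
    println(viaToString) // userName not in List(NA, n/a, nil)
    println(viaMkString) // userName not in ('NA','n/a','nil')
  }
}
```

The first string is what the SQL parser actually receives in the original query, which is why it is rejected.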
You need to combine the formatting from your second attempt with the plain string interpolation from your first:
val IgnoreList = List(""," ","0","-","{}","()","[]","null","Null","NULL","false","False","FALSE","NA","na","Na","n/a","N/a","N/A","nil","Nil","NIL")
val condition = "('" + IgnoreList.mkString("','") + "')"
val df = sqlContext.sql(s"select userName from names where userName not in $condition")
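By the way, formatting the list this way may be clearer (note that it produces only the comma-separated quoted values, so the surrounding parentheses still have to be added):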
IgnoreList.map(s => s"'$s'").mkString(",")
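As a rough check that the two formatting styles agree once the parentheses are added, the following sketch builds the condition both ways in plain Scala. The object name and the shortened list are hypothetical; the three-argument mkString(start, sep, end) overload supplies the parentheses for the map-based version.

```scala
object InListFormatting {
  // Hypothetical short list standing in for IgnoreList
  val ignore: List[String] = List("", " ", "NA", "n/a")

  // Concatenation style: quotes are supplied by the separator and the ends
  val concatenated: String = "('" + ignore.mkString("','") + "')"

  // Map style: quote each element, then join and wrap in parentheses
  val mapped: String = ignore.map(s => s"'$s'").mkString("(", ",", ")")

  def main(args: Array[String]): Unit = {
    println(mapped)                 // ('',' ','NA','n/a')
    println(concatenated == mapped) // true
  }
}
```

The two forms differ for an empty list: the concatenation style yields ('') while the map style yields (). Neither form escapes single quotes inside the values, so this approach assumes the list contains no such characters.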