
Apache Spark SQL get values in dataframe from SQL query

I'm trying to get a string value from a SQL query with Apache Spark 2.2.0 as follows:

val result = spark.sql("SELECT AnswerText FROM datatable WHERE participantUUID='010A0550' AND assessmentNumber=0 AND Q_id_string = '1_Age'")

assertResult("23") {
  result.collect.head.getString(0)
}

I get the following exception:

next on empty iterator
java.util.NoSuchElementException: next on empty iterator

I've tried collectAsList to return a row, but I'm not getting any joy from that either. I simply want to return the actual value from the query in the DataFrame, not the column, row or field. In this case the result is a string, but it could also be an int (the age of the person, 23).

This probably happens because the query doesn't return any rows, so calling head on the empty result throws. It would be better to use headOption:

assertResult(Some("23")) {
  result.take(1).headOption.map(_.getAs[String]("AnswerText"))
}
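To see why headOption helps, here is a minimal sketch in plain Scala with no Spark involved. The arrays are hypothetical stand-ins for what collect returns (an Array[Row] in the real code), and the object and value names are made up for illustration:

```scala
import scala.util.Try

object HeadVsHeadOption {
  // Hypothetical stand-ins for the result of collect:
  // no matching rows, and exactly one matching row.
  val noRows: Array[String] = Array.empty[String]
  val oneRow: Array[String] = Array("23")

  // head on an empty collection throws NoSuchElementException,
  // which is the failure reported in the question.
  val headFailed: Boolean = Try(noRows.head).isFailure

  // headOption turns "no rows" into None instead of throwing.
  val missing: Option[String] = noRows.headOption
  val found: Option[String] = oneRow.headOption

  // If the answer is needed as an Int, convert inside the Option.
  val age: Option[Int] = found.map(_.toInt)
}
```

With this shape, an empty result shows up as an assertion failure (None instead of Some("23")) rather than as an exception inside the test body.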

or push it to SQL:

assertResult(1) {
  spark
    .sql("""SELECT AnswerText 
            FROM datatable 
            WHERE participantUUID='010A0550' AND 
                  assessmentNumber=0 AND
                  Q_id_string = '1_Age'""")
    .where($"AnswerText" === "23").count
}
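For experimenting with either approach outside the real dataset, the following is a rough, self-contained sketch. It assumes a local SparkSession and a single made-up sample row registered under the table name from the question; the object AnswerLookup and its methods are hypothetical helpers, not part of Spark or of the original code:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object AnswerLookup {
  // Local session for experimentation only (assumption: Spark is on the classpath).
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("answer-lookup-sketch")
    .getOrCreate()

  // Registers one hypothetical row under the table name used in the question.
  def registerSampleData(): Unit = {
    import spark.implicits._
    Seq(("010A0550", 0, "1_Age", "23"))
      .toDF("participantUUID", "assessmentNumber", "Q_id_string", "AnswerText")
      .createOrReplaceTempView("datatable")
  }

  // Returns the answer text if a matching row exists, None otherwise.
  def answer(questionId: String): Option[String] =
    spark
      .sql("""SELECT Q_id_string, AnswerText
              FROM datatable
              WHERE participantUUID = '010A0550' AND assessmentNumber = 0""")
      .where(col("Q_id_string") === questionId) // filter by column, no string splicing
      .select("AnswerText")
      .take(1)
      .headOption
      .flatMap(row => Option(row.getString(0))) // a null AnswerText also becomes None
}
```

After calling AnswerLookup.registerSampleData(), AnswerLookup.answer("1_Age") yields Some("23"), while a question id that matches no row yields None instead of throwing.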
