How do I Select Counts with Spark SQL Without Getting Errors?
I am trying to do a very simple select statement to count the number of null values in the iPod column of my table in Spark. My table looks like this:
+-----+------+------+------+----+-----+
| Time|Period|iPhone|  iPad|iPod|  Mac|
+-----+------+------+------+----+-----+
|Q4/98|     1|  null|  null|null|0.944|
...
The command:
apl_df.select("count(iPod) from apl_tbl where iPod is null")
Gives: org.apache.spark.sql.AnalysisException: cannot resolve '`count(iPod) from apl_tbl where iPod is null`' given input columns: [iPhone, iPod, Mac, Period, iPad, Time];;
And
apl_df.selectExpr("count(iPod) from apl_tbl where iPod is null")
Gives: org.apache.spark.sql.catalyst.parser.ParseException:
Please help me fix this issue and understand the meaning of these errors.
The AnalysisException occurs because select() interprets its string argument as a column name, so Spark looked for a single column literally named count(iPod) from apl_tbl where iPod is null. selectExpr() accepts SQL expressions, but not a full query with from and where clauses, hence the ParseException. Try this:
apl_df.select("iPod").filter("iPod is null").count()
Or, if you want to use more familiar SQL syntax, register the DataFrame as a temporary view and query it. Note that count(iPod) skips null values, so under a where iPod is null filter it would always return 0; use count(*) to count the filtered rows:

apl_df.createOrReplaceTempView("apl_tbl")
spark.sql("select count(*) from apl_tbl where iPod is null")
See reference: https://spark.apache.org/docs/1.6.1/api/java/org/apache/spark/sql/DataFrame.html#selectExpr(scala.collection.Seq)