How do I Select Counts with Spark SQL Without Getting Errors?

I am trying to run a very simple select statement to count how many iPod values are null in my Spark table. My table looks like this:

+-----+------+------+------+----+-----+
| Time|Period|iPhone|  iPad|iPod|  Mac|
+-----+------+------+------+----+-----+
|Q4/98|     1|  null|  null|null|0.944|
...

The command:

apl_df.select("count(iPod) from apl_tbl where iPod is null")

Gives: org.apache.spark.sql.AnalysisException: cannot resolve '`count(iPod) from apl_tbl where iPod is null`' given input columns: [iPhone, iPod, Mac, Period, iPad, Time];;

And

apl_df.selectExpr("count(iPod) from apl_tbl where iPod is null")

Gives: org.apache.spark.sql.catalyst.parser.ParseException:

Please help me fix this issue and understand the meaning of these errors.

Try this:

apl_df.select("iPod").filter("iPod is null").count()

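Or, if you want to use more familiar SQL syntax, register the DataFrame as a temporary view and query it with spark.sql. Note that count(iPod) skips null values, so under the filter "iPod is null" it would always return 0; use count(*) to count the matching rows, and call show() to see the result:

apl_df.createOrReplaceTempView("apl_tbl")

spark.sql("select count(*) from apl_tbl where iPod is null").show()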

See reference: https://spark.apache.org/docs/1.6.1/api/java/org/apache/spark/sql/DataFrame.html#selectExpr(scala.collection.Seq)
