
Computing statistics of a Hive table in Spark

I created a DataFrame by loading CSV files and registered it as a temp table so that I could get the column statistics.

However, when I try to run the ANALYZE command, I get the error below. The same ANALYZE command runs successfully in Hive.

Spark version: 1.6.3

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("mode", "DROPMALFORMED")
  .load("/bn_data/bopis/*.csv")

// Register a temp table to get the statistics of the columns
df.registerTempTable("bopis")

val stat = sqlContext.sql("analyze table bopis compute statistics for columns").show()

Error:

    java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier analyze found

    analyze table bopis compute statistics for columns
    ^

Please let us know how to obtain the column statistics using Spark.

Thanks!

If you use the FOR COLUMNS option, you have to pass a list of column names; see https://docs.databricks.com/spark/latest/spark-sql/language-manual/analyze-table.html

In any case, even if you do, you are going to get an error, because you can't run compute statistics on a temp table (you will get: Table or view 'bopis' not found in database 'default').

You'll have to create a full-blown Hive table, either via df.write.saveAsTable("bopis_hive") or via sqlContext.sql("CREATE TABLE bopis_hive as SELECT * from bopis").
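Putting the two points together, a rough sketch might look like the following. It assumes a Hive-enabled context (HiveContext) is available, since persisting to the metastore needs one; the object name BopisColumnStats and the helper analyzeColumnsSql are made up for illustration, and the master URL is expected to come from spark-submit. Whether Spark 1.6.3 itself accepts the FOR COLUMNS form is not something I have verified, so treat the final statement as the shape to aim for rather than a guaranteed fix.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object BopisColumnStats {

  // Hypothetical helper: builds an ANALYZE statement with an explicit
  // column list, which the FOR COLUMNS option requires.
  def analyzeColumnsSql(table: String, columns: Seq[String]): String =
    s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS ${columns.mkString(", ")}"

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bopis-column-stats"))

    // Assumption: a Hive-enabled context is used instead of a plain SQLContext.
    val hiveContext = new HiveContext(sc)

    val df = hiveContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("mode", "DROPMALFORMED")
      .load("/bn_data/bopis/*.csv")

    // Persist the data as a real metastore table rather than a temp table.
    df.write.saveAsTable("bopis_hive")

    // Assumption: statistics are wanted for every column; pass a subset if not.
    hiveContext.sql(analyzeColumnsSql("bopis_hive", df.columns.toSeq))

    sc.stop()
  }
}
```

For example, analyzeColumnsSql("bopis_hive", Seq("a", "b")) produces the statement ANALYZE TABLE bopis_hive COMPUTE STATISTICS FOR COLUMNS a, b, where a and b stand in for real column names from the CSV header.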


 