
INSERT SPARK DATAFRAME INTO HIVE Managed Acid Table not working, HDP 3.0

I have an issue with inserting a Spark DataFrame into a Hive table. Can anyone please help me out? HDP version 3.1, Spark version 2.3. Thanks in advance.

// ORIGINAL CODE PART

import org.apache.spark.SparkContext
import org.apache.spark.sql.{DataFrame, SparkSession}
import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder

val spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
val hive = HiveWarehouseBuilder.session(spark).build()
/*
Some transformation operations happen here; the output of the transformation is stored in val result.
*/
val result = {
  num_records
  .union(df.transform(profile(heatmap_cols2type)))
}

result.createOrReplaceTempView("out_temp") // create a temp view in Spark's catalog

scala> result.show()
+-----+--------------------+-----------+------------------+------------+-------------------+
| type|              column|      field|             value|       order|               date|
+-----+--------------------+-----------+------------------+------------+-------------------+
|TOTAL|                 all|num_records|               737|           0|2019-12-05 18:10:12|
|  NUM|available_points_...|    present|               737|           0|2019-12-05 18:10:12|

hive.setDatabase("EXAMPLE_DB")
hive.createTable("EXAMPLE_TABLE").ifNotExists().column("`type`", "String").column("`column`", "String").column("`field`", "String").column("`value`","String").column("`order`", "bigint").column("`date`", "TIMESTAMP").create()

hive.executeUpdate("INSERT INTO TABLE EXAMPLE_DB.EXAMPLE_TABLE SELECT * FROM out_temp");

----- ERROR of original code ----------------
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:86 Table not found 'out_temp'
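
The executeUpdate statement is compiled and run by Hive (via HiveServer2), and Hive's catalog does not contain Spark temp views, which is why out_temp is not found. A minimal sketch of the split, using the spark and hive sessions created above:

// out_temp lives only in Spark's catalog
spark.sql("SHOW TABLES").show()   // includes the temp view out_temp
hive.showTables().show()          // only tables Hive knows about, e.g. EXAMPLE_TABLE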

What I tried as an alternative (since Hive and Spark use independent catalogs, per the documentation on HWC write operations):

spark.sql("SELECT type, column, field, value, order, date FROM out_temp").write.format("HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR").option("table", "wellington_profile").save() spark.sql("SELECT type, column, field, value, order, date FROM out_temp").write.format("HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR").option("table", "wellington_profile").save()

------- ERROR of alternative step ----------------
java.lang.ClassNotFoundException: Failed to find data source: HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR. Please find packages at http://spark.apache.org/third-party-projects.html
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:639)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:241)
  ... 58 elided
Caused by: java.lang.ClassNotFoundException: HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR.DefaultSource
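
That ClassNotFoundException comes from passing the format name as the literal string "HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR" instead of the value of that constant. A sketch of the write with the constant passed unquoted (the target table name is kept from the attempt above and is an assumption; adjust it to your case):

import com.hortonworks.hwc.HiveWarehouseSession   // already imported above

// HIVE_WAREHOUSE_CONNECTOR is a String constant holding the connector's class name,
// so it must be passed as a value, not quoted as a literal.
result.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", "wellington_profile")   // assumed target table
  .save()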

My question is:

Instead of saving out_temp as a temp view in Spark, is there any way to directly create the table in Hive? Is there any way to insert into a Hive table from a Spark DataFrame?

Thank you everyone for your time!

// Write the DataFrame out as Parquet files:
result.write.save("example_table.parquet")

// Or create/overwrite a table directly from the DataFrame:
import org.apache.spark.sql.SaveMode
result.write.mode(SaveMode.Overwrite).saveAsTable("EXAMPLE_TABLE")
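
Since Spark and Hive keep independent catalogs on HDP 3, a table created with saveAsTable is registered on the Spark side, so a quick check of the result reads it back through Spark rather than through the hive session (a minimal sketch):

spark.table("EXAMPLE_TABLE").show(5)
spark.sql("DESCRIBE TABLE EXAMPLE_TABLE").show(false)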

You can read more details here.
