
Cannot create table with Spark SQL: Hive support is required to CREATE Hive TABLE (AS SELECT)

I'm trying to create a table in Spark (Scala) and then insert values from two existing DataFrames, but I got this exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT);;
'CreateTable `stat_type_predicate_percentage`, ErrorIfExists 

Here is the code:

import java.io.File

import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

case class stat_type_predicate_percentage(type1: Option[String], predicate: Option[String], outin: Option[Int], percentage: Option[Float])

object LoadFiles1 {

  def main(args: Array[String]) {
    val sc = new SparkContext("local[*]", "LoadFiles1")
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val warehouseLocation = new File("spark-warehouse").getAbsolutePath
    val spark = SparkSession
      .builder()
      .appName("Spark Hive Example")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    import sqlContext.implicits._
    import org.apache.spark.sql._
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{StructType, StructField, StringType}

    // statistics
    val create = spark.sql("CREATE TABLE stat_type_predicate_percentage (type1 String, predicate String, outin INT, percentage FLOAT) USING hive")
    val insert1 = spark.sql("INSERT INTO stat_type_predicate_percentage SELECT types.type, res.predicate, 0, 1.0*COUNT(subject)/(SELECT COUNT(subject) FROM MappingBasedProperties AS resinner WHERE res.predicate = resinner.predicate) FROM MappingBasedProperties AS res, MappingBasedTypes AS types WHERE res.subject = types.resource GROUP BY res.predicate, types.type")

    val select = spark.sql("SELECT * FROM stat_type_predicate_percentage")
  }
}

How should I solve it?

--- You have to enable Hive support in your SparkSession. Note that your code creates a plain SparkContext (and SQLContext) before building the session; getOrCreate() will reuse that existing context, so the Hive setting may never take effect. Drop those two lines and let the builder create everything:

val spark = SparkSession
  .builder()
  .appName("JOB2")
  .master("local")
  .enableHiveSupport()
  .getOrCreate()

--- This problem may be twofold. For one, you might want to do what @Tanjin suggested in the comments, and it might work afterwards (try adding .config("spark.sql.catalogImplementation","hive") to your SparkSession.builder()). But if you actually want to use an existing Hive instance with its own metadata, which you'll be able to query from outside your job, or if you already want to use its existing tables, you should add a hive-site.xml to your configuration.
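A minimal sketch of that first option (the object name, app name, and demo table below are made up for illustration):

import org.apache.spark.sql.SparkSession

object CatalogExample {
  def main(args: Array[String]): Unit = {
    // Setting spark.sql.catalogImplementation to "hive" has the same
    // effect as calling enableHiveSupport() on the builder.
    val spark = SparkSession
      .builder()
      .appName("CatalogExample") // hypothetical app name
      .master("local[*]")
      .config("spark.sql.catalogImplementation", "hive")
      .getOrCreate()

    // With the Hive catalog active, CREATE TABLE ... USING hive works.
    spark.sql("CREATE TABLE IF NOT EXISTS demo (id INT) USING hive")
    spark.sql("SHOW TABLES").show()
    spark.stop()
  }
}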

This configuration file contains some properties you probably want, like hive.metastore.uris, which will let your context register a new table so that it is saved in the store. And thanks to the metastore, which holds the table definitions and their locations, it will also be able to read from the tables already in your Hive instance.
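If shipping a hive-site.xml is inconvenient, the same metastore property can also be set programmatically on the builder; a minimal sketch, assuming a metastore service reachable at a placeholder thrift URI (the host, port, object name, and app name below are made up):

import org.apache.spark.sql.SparkSession

object RemoteMetastoreExample {
  def main(args: Array[String]): Unit = {
    // The thrift URI below is a placeholder: point it at your own
    // Hive metastore service instead.
    val spark = SparkSession
      .builder()
      .appName("RemoteMetastoreExample") // hypothetical app name
      .master("local[*]")
      .config("hive.metastore.uris", "thrift://metastore-host:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Tables registered in the existing Hive instance should now be visible.
    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}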
