
Hive table in Spark

I am running the following job in HDP:

export SPARK_MAJOR_VERSION=2
spark-submit --class com.spark.sparkexamples.Audit --master yarn --deploy-mode cluster \
--files /bigdata/datalake/app/config/metadata.csv BRNSAUDIT_v4.jar dl_raw.ACC /bigdatahdfs/landing/AUDIT/BW/2017/02/27/ACC_hash_total_and_count_20170227.dat TH 20170227

It is failing with the following error:

Table or view not found: `dl_raw`.`ACC`; line 1 pos 94;
'Aggregate [count(1) AS rec_cnt#58L, 'count('BRCH_NUM) AS hashcount#59, 'sum('ACC_NUM) AS hashsum#60]
+- 'Filter (('trim('country_code) = trim(TH)) && ('from_unixtime('unix_timestamp('substr('bus_date, 0, 11), MM/dd/yyyy), yyyyMMdd) = 20170227))
   +- 'UnresolvedRelation `dl_raw`.`ACC`

Whereas the table is present in Hive and is accessible from spark-shell.
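For reference, one quick way to double-check this from spark-shell (using the `spark` session the shell provides; the database and table names are taken from the question) is:

    // Inside spark-shell the built-in `spark` session already has Hive support.
    spark.sql("show tables in dl_raw").show()            // ACC should be listed
    spark.sql("select * from dl_raw.ACC limit 5").show() // and readable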

UPD.

    import org.apache.spark.sql.SparkSession

    val sparkSession = SparkSession.builder
      .appName("spark session example")
      .enableHiveSupport() // required for resolving Hive tables such as dl_raw.ACC
      .getOrCreate()
    sparkSession.conf.set("spark.sql.crossJoin.enabled", "true")

    val df_table_stats = sparkSession.sql("""select count(*) as rec_cnt,
        count(distinct BRCH_NUM) as hashcount,
        sum(ACC_NUM) as hashsum
        from dl_raw.ACC
        where trim(country_code) = trim('BW')
        and from_unixtime(unix_timestamp(substr(bus_date, 0, 11), 'MM/dd/yyyy'), 'yyyyMMdd') = '20170227'""")

When submitting the job, include the hive-site.xml file in the --files argument. In yarn cluster mode the driver runs on a cluster node rather than the machine you submit from, so without hive-site.xml it cannot locate the Hive metastore and falls back to a default catalog, which is why `dl_raw`.`ACC` resolves from spark-shell but not from the submitted job.
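Concretely, that means extending the comma-separated --files list of the original command. A minimal sketch, assuming hive-site.xml lives in /etc/spark2/conf (the usual HDP location; adjust to the cluster's actual configuration directory):

    export SPARK_MAJOR_VERSION=2
    spark-submit --class com.spark.sparkexamples.Audit --master yarn --deploy-mode cluster \
    --files /bigdata/datalake/app/config/metadata.csv,/etc/spark2/conf/hive-site.xml \
    BRNSAUDIT_v4.jar dl_raw.ACC /bigdatahdfs/landing/AUDIT/BW/2017/02/27/ACC_hash_total_and_count_20170227.dat TH 20170227

Shipped this way, YARN places hive-site.xml in the working directory of the driver and executor containers, where Spark can pick it up when building the Hive-enabled session.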
