I am trying to set up a Spark action workflow in Apache Oozie, but I get the following error when select * from db.table
is called within my Spark code in a Hive context:
org.apache.spark.sql.AnalysisException: Table not found: `db`.`table`; line 1 pos 34
This Spark job works with spark-submit, so I can't nail down the issue. I've added hive-site.xml to the locations recommended in previous questions, such as the workspace lib directory and the workspace directory, and I've also added it to the job.xml setting, but I still get the same error.
I'm running with master yarn and deploy mode cluster.
I've tried many combinations and am not sure what else to do.
Where am I going wrong?
You need to make the Hive configuration available to the Spark action. For example, add a file element to the action in the workflow that points to where hive-site.xml is located:
<spark xmlns="uri:oozie:spark-action:1.0">
<!-- ... -->
<file>${hiveConfig}</file>
</spark>
job.properties must define the property that the workflow references:
hiveConfig=/user/oozie/extraconfig/hive-site.xml
This file must be accessible from every node of the cluster.
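Before shipping the file with the action, it can help to confirm that it actually names a metastore. When Spark cannot find a usable hive-site.xml, it typically falls back to a local embedded metastore that does not contain your tables, which is a common cause of "Table not found". The following is a minimal sketch that uses only the Python standard library and assumes the usual Hadoop configuration layout (property elements with name and value children). The helper names, the sample content and the host metastore-host are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET


def read_hadoop_conf(xml_text):
    """Parse Hadoop-style configuration XML into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            conf[name.strip()] = (value or "").strip()
    return conf


def metastore_uris(conf):
    """Return the configured metastore URIs, or an empty list if none are set."""
    raw = conf.get("hive.metastore.uris", "")
    return [uri.strip() for uri in raw.split(",") if uri.strip()]


# Hypothetical sample content; a real hive-site.xml has many more properties.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    conf = read_hadoop_conf(SAMPLE)
    uris = metastore_uris(conf)
    if uris:
        print("metastore URIs:", uris)
    else:
        print("no hive.metastore.uris set; Spark may use a local metastore")
```

To check a real file, read its contents and pass the text to read_hadoop_conf instead of SAMPLE. An empty result suggests that the file would not direct Spark to the shared metastore even if Oozie ships it correctly.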