
Using Spark to Read from Hive

Problem

I am attempting to read from a Hive table, but am receiving the following error:

[error] (run-main-0) org.apache.spark.sql.AnalysisException: Table or view not found: tags; line 1 pos 14

I have placed hive-site.xml in both $SPARK_HOME/conf and $HIVE_HOME/conf. I also had no trouble using Sqoop to pull the data from MySQL and import it into Hive. Is something wrong with my Scala code, or is this a configuration error?

Scala Code:

package test1

import java.io.File
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession

case class Movie(movieid: String, title: String, genres: String)
case class Tag(userid: String, title: String, tag: String)

object SparkHiveTest {
    def main(args: Array[String]) {
        // Directory Spark uses for managed (non-external) tables
        val warehouseLocation = new File("spark-warehouse").getAbsolutePath
        val spark = SparkSession
            .builder()
            .master("local")
            .appName("SparkHiveExample")
            .config("spark.sql.warehouse.dir", warehouseLocation)
            .enableHiveSupport()  // needed for Spark SQL to talk to the Hive metastore
            .getOrCreate()

        // Query the Hive table that was imported via Sqoop
        spark.sql("SELECT * FROM tags").show()
        spark.stop()
    }
}

hive-site.xml:

<configuration>
   <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
      <description>metadata is stored in a MySQL server</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>MySQL JDBC driver class</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hiveuser</value>
      <description>user name for connecting to mysql server</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hivepass</value>
      <description>password for connecting to mysql server</description>
   </property>
</configuration>

Answer

Make sure your Hive metastore is configured properly:

<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>HIVE METASTORE URI(S) HERE</value>
    <description>URI for client to contact metastore server</description>
  </property>
</configuration>
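
For example, on a common single-node setup the metastore Thrift service listens on the default port 9083. The host and port below are assumptions for illustration only; substitute the values for your own cluster:

<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- Assumed example: metastore Thrift service running locally on the default port -->
    <value>thrift://localhost:9083</value>
    <description>URI for client to contact metastore server</description>
  </property>
</configuration>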

According to the API documentation for HiveContext:

An instance of the Spark SQL execution engine that integrates with data stored in Hive. Configuration for Hive is read from hive-site.xml on the classpath.

Therefore, be sure to put your hive-site.xml into the resources folder of your project in your IDE (typically src/main/resources for an sbt or Maven project), so that it ends up on the classpath.
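
As a quick sanity check, a minimal sketch (assuming the same local SparkSession setup as in the question) that lists the tables the metastore exposes will confirm whether hive-site.xml is actually being picked up from the classpath:

package test1

import org.apache.spark.sql.SparkSession

object HiveSanityCheck {
    def main(args: Array[String]) {
        val spark = SparkSession
            .builder()
            .master("local")
            .appName("HiveSanityCheck")
            .enableHiveSupport()  // hive-site.xml must be on the classpath for this to reach the MySQL-backed metastore
            .getOrCreate()

        // If the metastore is wired up correctly, the Sqoop-imported tables (e.g. tags) should appear here
        spark.sql("SHOW TABLES").show()
        spark.catalog.listTables().show()

        spark.stop()
    }
}

If tags does not appear in this listing, the session is most likely still falling back to an embedded (Derby-backed) metastore and local spark-warehouse directory rather than the MySQL-backed metastore configured in hive-site.xml.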

It solved my problem immediately.
