I am trying to find some docs / description of the approach on the subject, please help. I have Hadoop 2.2.0 from Hortonworks installed with some existing Hive tables I need to query. Hive SQL works extremly and unreasonably slow on single node and cluster as well. I hope Shark will work faster.
From Spark/Shark docs I can not figure out how to make Shark work with existing Hive tables. Any ideas how to achieve this? Thanks!
You need to configure the metastore within the shark-specific hive directory. Details are provided at a similar question I answered here .
In summary, you will need to copy the hive-default.xml to hive-site.xml . Then ensure the metastore properties are set.
Here is the basic info in hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://myhost/metastore</value>
<description>the URL of the MySQL database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mypassword</value>
</property>
You can get more details here: configuring hive metastore
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.