
Sqoop - Import all tables from MySQL to Hive

I have three tables in my MySQL database:

parent_table and two join tables, foo and bar. parent_table has many foos and bars, and each foo and bar belongs to parent_table.

How can I use Sqoop, or an alternative method, to import these tables into Hive so that they can be queried?

Here is an example:

sqoop import-all-tables \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username=retail_dba \
  --password=cloudera \
  --warehouse-dir=/user/hive/warehouse/retail_stage.db

In the --connect clause, retail_db is the MySQL database. In the --username clause, retail_dba is the MySQL user that has read access to the tables in retail_db. In the --password clause, cloudera is the password for retail_dba. In --warehouse-dir, /user/hive/warehouse/retail_stage.db is the Hadoop directory to write to (in this case it is the directory of a Hive database, but any valid Hadoop directory works). The script above creates one directory per MySQL table under /user/hive/warehouse/retail_stage.db.

You can run this script as is in the Cloudera Quickstart VM.
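Note that --warehouse-dir on its own only writes files to HDFS; it does not register tables in the Hive metastore. If the goal is to query the tables from Hive directly, a variant along the following lines may help. This is a minimal sketch, not part of the original answer: it assumes a Hive database named retail_stage already exists (the name is inferred from the warehouse path above), uses -P to prompt for the password rather than passing it on the command line, and sets a single mapper in case foo or bar has no primary key. It prints the command by default and only runs it when RUN=1 is set.

```shell
#!/bin/sh
# Sketch: import every table of retail_db into Hive tables.
# Assumptions (not from the original answer):
#   - the Hive database "retail_stage" already exists
#   - prompting for the password (-P) is acceptable
#   - one mapper per table is fast enough (avoids needing a split column)
set -- import-all-tables \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username retail_dba \
  -P \
  --hive-import \
  --hive-database retail_stage \
  --num-mappers 1

# Dry run by default: show the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = "1" ]; then
  sqoop "$@"
else
  echo "sqoop $*"
fi
```

Once the import finishes, the tables should show up under retail_stage in Hive (for example via SHOW TABLES), where parent_table can be joined to foo and bar as usual.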

You can start by looking at the Sqoop User Guide, which describes how to use Sqoop, or at the more use-case-oriented book Apache Sqoop Cookbook. Both sources should help you understand what needs to be done in this case.


 