Sqoop - Import all tables from mysql to hive
I have three tables in my MySQL database: parent_table, with two join tables, foo and bar, where parent_table has many foo's and bar's, and foo and bar belong to parent_table.
How can I use Sqoop, or an alternative method, to import these tables into Hive so they can be queried?
Here is the example:
sqoop import-all-tables \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--warehouse-dir=/user/hive/warehouse/retail_stage.db
- retail_db in the --connect clause is the MySQL database.
- retail_dba in the --username clause is a MySQL user with read access to the tables in the retail_db database.
- cloudera in the --password clause is the password for the MySQL user retail_dba.
- /user/hive/warehouse/retail_stage.db in the --warehouse-dir clause is a Hadoop directory (in this case it is a Hive database directory, but any valid Hadoop directory will work).

The script above will create a directory under /user/hive/warehouse/retail_stage.db for each MySQL table.
You can run this script as-is in the Cloudera Quickstart VM.
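If you want the tables registered in the Hive metastore directly (so they are immediately queryable), rather than just landing files under the warehouse directory, Sqoop's --hive-import flag can do both steps at once. A sketch, assuming the same Cloudera Quickstart connection details and that a Hive database named retail_stage has already been created (CREATE DATABASE retail_stage;):

```shell
# Import every table from the retail_db MySQL database and
# create a matching Hive table for each one in retail_stage.
sqoop import-all-tables \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username retail_dba \
  --password cloudera \
  --hive-import \
  --hive-database retail_stage \
  --hive-overwrite \
  --num-mappers 1
```

Note that import-all-tables normally requires every table to have a primary key so the import can be split across mappers; --num-mappers 1 sidesteps that requirement for tables without one, at the cost of a serial import.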
You can start by taking a look at the Sqoop User Guide, which describes how to use Sqoop, or the more use-case-oriented book Apache Sqoop Cookbook. Both sources should help you understand what needs to be done in this case.
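Once the three tables from the question have been imported into a Hive database, the has-many relationship can be queried with ordinary HiveQL joins. A sketch, assuming the tables were imported with --hive-import into a database named retail_stage, and using hypothetical column names (id, parent_id, name) that you would adjust to the real schema:

```shell
# Join parent_table to its foo and bar children in Hive.
# Column names here are placeholders for illustration only.
hive -e "
USE retail_stage;
SELECT p.id, f.name AS foo_name, b.name AS bar_name
FROM parent_table p
JOIN foo f ON f.parent_id = p.id
JOIN bar b ON b.parent_id = p.id;
"
```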