How do I sync a MySQL table to a Hive table? (sqoop --incremental lastmodified is not supported for Hive imports)
I want to sync a MySQL table into a Hive table. Records in the orders table usually change again shortly after they are written, so I need to propagate those updates to Hive.
For example, I want to pick up the records whose time_update falls within the last 1 day and update them in the Hive table. I have tried --incremental lastmodified, as below:
sqoop import \
"-Dorg.apache.sqoop.splitter.allow_text_splitter=true" \
--connect $DB_URL \
--username $USERNAME \
--password $PASSWORD \
--direct \
--fields-terminated-by '\t' \
--target-dir '/data/hive/' \
--delete-target-dir \
--hive-database $HIVE_DB \
--hive-table $HIVE_TABLE \
--hive-import \
--hive-overwrite \
--create-hive-table \
--query 'select * from '$HIVE_TABLE' where $CONDITIONS' \
--split-by id \
-m 6 \
--merge-key id \
--incremental lastmodified \
--check-column time_update \
--last-value "2019-01-01 21:00:00"
I got this error:
--incremental lastmodified option for hive imports is not supported. Please remove the parameter --incremental lastmodified.
What is the proper way to do this without the --incremental lastmodified option?
First, you have to remove the --delete-target-dir and --create-hive-table arguments. In an incremental import the target directory has to stay as it is, so --delete-target-dir will not work together with --incremental.
The Hive table should also be created only once, so instead of passing --create-hive-table, create the Hive table manually in Hive with the same schema, look up the location of that table, and use it as --target-dir.
sqoop import \
--connect <<db_url>> \
--username <<username>> \
--password <<password>> \
--direct \
--fields-terminated-by '\t' \
--hive-database <<hive_db>> \
--hive-table <<hive_table>> \
--hive-import \
--hive-overwrite \
--query 'select * from <<db_table>> where $CONDITIONS' \
--split-by product_id \
-m 6 \
--merge-key product_id \
--incremental lastmodified \
--check-column timedate \
--last-value 0 \
--target-dir /user/hive/warehouse/problem5.db/products_hive   # i.e. <<hive_table_location>>
This should work; if it does not, let me know.
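In case the error still shows up, here is a minimal sketch of a variant that leaves out the Hive-specific options altogether. The error message ties the restriction to Hive imports, so this version only writes to the HDFS directory behind the manually created Hive table and lets --merge-key fold changed rows into what is already there. Everything in it is an assumption for illustration: the JDBC URL, DB_USER, DB_PASSWORD and HIVE_TABLE_LOCATION are made-up placeholders, and it assumes the Hive table was created with tab-delimited fields at that location. The orders table and the id and time_update columns come from the question. By default it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from a real setup.
set -eu

DB_URL="${DB_URL:-jdbc:mysql://dbhost:3306/shop}"          # hypothetical JDBC URL
DB_USER="${DB_USER:-etl_user}"                             # hypothetical user
DB_PASSWORD="${DB_PASSWORD:-changeme}"                     # hypothetical password
DB_TABLE="${DB_TABLE:-orders}"                             # source table from the question
# Hypothetical LOCATION of the manually created Hive table.
HIVE_TABLE_LOCATION="${HIVE_TABLE_LOCATION:-/user/hive/warehouse/shop.db/orders}"
# Only rows with time_update newer than this are fetched.
LAST_VALUE="${LAST_VALUE:-2019-01-01 21:00:00}"

# Dry-run helper: prints the command line instead of running anything.
print_args() {
  printf 'sqoop'
  printf ' %s' "$@"
  printf '\n'
}

# $1 is the command that receives the arguments:
#   print_args (default) shows the command line, sqoop performs the import.
incremental_import() {
  runner="${1:-print_args}"
  # No --hive-import, --hive-overwrite, --create-hive-table or
  # --delete-target-dir: the data lands directly in the table's directory.
  "$runner" import \
    --connect "$DB_URL" \
    --username "$DB_USER" \
    --password "$DB_PASSWORD" \
    --table "$DB_TABLE" \
    --fields-terminated-by '\t' \
    --target-dir "$HIVE_TABLE_LOCATION" \
    --split-by id \
    -m 6 \
    --incremental lastmodified \
    --check-column time_update \
    --last-value "$LAST_VALUE" \
    --merge-key id
}

# Dry run: inspect the generated command first.
incremental_import print_args
```

Once the printed command looks right, calling incremental_import sqoop runs the actual import. LAST_VALUE has to be moved forward after each run, for example to the time the previous import started.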