
Transfer Hive query result from one Hadoop cluster to another Hadoop cluster

I have two clusters, A and B. Cluster A has 5 tables. I need to run a Hive query on these 5 tables, and the result of the query should update the table data on cluster B (the target table covers all the columns of the query result).

Note: We should not create any files on cluster A during this process, but temp files are allowed.

Is this doable? What permissions/configurations are required between the two clusters to achieve this?

How can I accomplish this task, or is there a more efficient alternative?

Once this works, I need to automate it using Oozie.

Do you use a database for each cluster's metadata or Hive tables? If so, and both clusters use the same database to store their Hive tables, then you can share them. I know it sounds obvious, but I mention it in case you haven't considered it.
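If sharing the metastore is not an option, one possible alternative is to have Hive on cluster A write the query result straight to a fully qualified HDFS path on cluster B with `INSERT OVERWRITE DIRECTORY`, and then read it on cluster B (for example through an external table over that directory). Below is a minimal Python sketch that only builds the statement and the `hive -e` command line. The NameNode address `namenode-b:8020`, the target path, and the table names are hypothetical placeholders, and it assumes cluster A can reach cluster B's NameNode and that the submitting user may write to the target directory. Whether Hive's intermediate files end up on cluster A depends on its scratch/staging configuration, so that should be verified against the "no files on cluster A" requirement.

```python
import shlex
from typing import List


def build_export_query(select_sql: str, target_uri: str) -> str:
    """Build a HiveQL statement that writes a SELECT result to target_uri.

    target_uri must be a fully qualified hdfs:// URI so that the output
    is written to the remote cluster rather than the local default FS.
    """
    if not target_uri.startswith("hdfs://"):
        raise ValueError("target_uri must be a fully qualified hdfs:// URI")
    # Drop surrounding whitespace and any trailing semicolon from the SELECT.
    body = select_sql.strip().rstrip(";")
    return f"INSERT OVERWRITE DIRECTORY '{target_uri}' {body};"


def build_hive_command(query: str) -> List[str]:
    """Return the argument list for running the query with the Hive CLI."""
    return ["hive", "-e", query]


if __name__ == "__main__":
    # Hypothetical tables and cluster B address, for illustration only.
    select_sql = "SELECT t1.id, t2.value FROM t1 JOIN t2 ON t1.id = t2.id"
    target = "hdfs://namenode-b:8020/data/query_result"
    cmd = build_hive_command(build_export_query(select_sql, target))
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

The same statement could be placed in a script and scheduled from an Oozie workflow later; the sketch only prints the command so it can be checked before anything is run.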
