
How to copy files from HDFS to remote HDFS

I want to copy files from my Hadoop cluster to a remote cluster.

I have the remote cluster's hadoop_conf files and can access that cluster by setting HADOOP_CONF_DIR.

I know the IP and port of the remote NameNode.

I want to copy the file by namespace, as in the example below.

ex) hadoop fs -cp hdfs://MyNamespace/path/file hdfs://RemoteNamespace/path/file

However, if I do not point HADOOP_CONF_DIR at the remote cluster's configuration, the remote namespace is unknown, and if I do point HADOOP_CONF_DIR at the remote cluster's configuration, I cannot access my own cluster's namespace.

Please let me know how to do it.

The typical way to copy between clusters is to use distcp.

$ hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo

See the DistCp Version2 Guide for more information.
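
Applied to the question, one possible approach is to keep your own HADOOP_CONF_DIR (so hdfs://MyNamespace still resolves) and address the remote side by NameNode host and port instead of by namespace. The following is only a rough sketch: the host remote-nn.example.com, the port 8020, the paths, and the helper names hdfs_uri and distcp_cmd are all placeholders I made up for illustration. It prints the command by default and only runs it when RUN=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: build a distcp command that reaches the remote cluster through its
# NameNode host:port, so the local HADOOP_CONF_DIR does not need to change.
# Hosts, ports, and paths below are placeholders, not values from the question.

# Print an hdfs:// URI for a NameNode host, RPC port, and absolute path.
hdfs_uri() {
  local host="$1" port="$2" path="$3"
  printf 'hdfs://%s:%s%s' "$host" "$port" "$path"
}

# Print the distcp command line for a source URI and a destination URI.
distcp_cmd() {
  local src="$1" dst="$2"
  printf 'hadoop distcp %s %s\n' "$src" "$dst"
}

# Source: local namespace, resolved from the local configuration.
SRC="hdfs://MyNamespace/path/file"
# Destination: remote NameNode address (assumed host and port).
DST="$(hdfs_uri remote-nn.example.com 8020 /path/file)"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
  hadoop distcp "$SRC" "$DST"
else
  distcp_cmd "$SRC" "$DST"
fi
```

If the remote cluster runs NameNode HA, the host you pass has to be the currently active NameNode, since a standby will reject the request.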
