
hadoop hdfs java - what is the best way to copy a list of files from hdfs to hdfs

I have a file with two columns: the first column is the hdfs path to a source file, and the second column is the hdfs path to the corresponding target file:

s1, t1
s2, t2
.., ..
sn, tn

What is the fastest way to copy each source path to its respective target path? Is there a tool for this in hadoop?

The list is probably 100-200 lines long, and each file is a few megabytes.


If this is a one-off kind of situation, then it isn't large enough to worry about. A dumb ol' shell loop will do fine:

while IFS=', ' read -r src dst; do hdfs dfs -cp "$src" "$dst"; done < pairs-file
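The loop's parsing can be sanity-checked without a cluster: the sketch below uses plain `cp` as a stand-in for `hdfs dfs -cp`, and the file names (`pairs-file`, `s1`, `t1`, …) are hypothetical, matching the "source, target" format from the question.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A pairs file in the same "source, target" format as the question.
printf 's1, t1\ns2, t2\n' > pairs-file
echo "data1" > s1
echo "data2" > s2

# IFS=', ' splits each line on the comma (and trims the space),
# giving the source and target paths separately; 'cp' stands in
# for 'hdfs dfs -cp' in this local demo.
while IFS=', ' read -r src dst; do
  cp "$src" "$dst"
done < pairs-file

ls t1 t2   # both targets now exist
```

On a real cluster you would substitute `hdfs dfs -cp` for `cp`; quoting `"$src"` and `"$dst"` keeps the loop safe for paths containing unusual characters.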

