
Copy Multiple files from HDFS to local: Multithreading?

In my Java application, I need to copy multiple files from HDFS to the local file system.

Which of the two approaches below will be faster?
1. Sequentially copy the files one by one.
2. Run parallel threads, each copying one file.

If your local file system sits on a single physical disk, then a sequential approach is usually best. With a hard drive, parallel writes can force the disk head to seek back and forth unnecessarily (how much depends on the nature of the writes and on how well the OS can schedule them). You also have only one physical resource to work with at a time, so one thread is enough.

If the local file system spans multiple physical disks, running parallel threads may give better performance: for example, thread A writes all files destined for drive C while thread B writes all files destined for drive D.
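
As a rough illustration of the "one thread per disk" idea, here is a minimal Java sketch. It is not a definitive implementation. The names `PerDiskCopier`, `CopyJob`, and `Transfer` are hypothetical, and the sketch assumes that the root of the target path (such as `C:\` or `D:\`) identifies a physical disk, which will not hold on every system. The `Transfer` hook stands in for the real HDFS call; in a Hadoop application it could wrap something like `FileSystem.copyToLocalFile`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class PerDiskCopier {

    /** One copy job: where the file comes from and where it should land locally. */
    public static final class CopyJob {
        final Path source;
        final Path target;

        public CopyJob(Path source, Path target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Hypothetical hook for the actual transfer (e.g. an HDFS-to-local copy). */
    @FunctionalInterface
    public interface Transfer {
        void copy(Path source, Path target) throws IOException;
    }

    /**
     * Copies all jobs, using one thread per target root.
     * Assumption: the root of the target path identifies a physical disk.
     * If every target shares one root, this degenerates to a sequential copy.
     * Returns the number of files copied.
     */
    public static int copyAll(List<CopyJob> jobs, Transfer transfer)
            throws InterruptedException, ExecutionException {
        // Group jobs by target root so each "disk" gets its own ordered queue.
        Map<Path, List<CopyJob>> byDisk = jobs.stream()
                .collect(Collectors.groupingBy(j -> j.target.toAbsolutePath().getRoot()));
        if (byDisk.isEmpty()) {
            return 0;
        }

        ExecutorService pool = Executors.newFixedThreadPool(byDisk.size());
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<CopyJob> diskJobs : byDisk.values()) {
                // Within one disk, files are copied strictly one after another.
                results.add(pool.submit(() -> {
                    int copied = 0;
                    for (CopyJob job : diskJobs) {
                        transfer.copy(job.source, job.target);
                        copied++;
                    }
                    return copied;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // rethrows any copy failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this actually beats a plain sequential loop depends on the hardware and on network throughput from HDFS, so it is worth timing both variants on the real workload before settling on one.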


 