
HDFS buffered write/read operations

I am using the HDFS Java API with FSDataOutputStream and FSDataInputStream to write files to and read files from a 4-machine Hadoop 2.6.0 cluster.

The FS stream implementations take a bufferSize constructor parameter, which I assume controls the stream's internal cache. But it seems to have no effect at all on the write/read speed, regardless of its value (I tried values from 8 KB up to several megabytes).

Is there a way to achieve buffered writes/reads to the HDFS cluster other than wrapping FSDataOutputStream/FSDataInputStream in BufferedOutputStream/BufferedInputStream?
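For reference, here is a minimal sketch of the two approaches mentioned above; the file path, buffer sizes, and class name are placeholders, not values from the original setup:

    import java.io.BufferedOutputStream;
    import java.io.OutputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/example.dat"); // hypothetical path

            byte[] chunk = new byte[8 * 1024];

            // 1) Pass the bufferSize argument directly to create():
            //    FSDataOutputStream create(Path f, boolean overwrite, int bufferSize)
            try (FSDataOutputStream out = fs.create(path, true, 1024 * 1024)) {
                out.write(chunk);
            }

            // 2) Wrap the HDFS stream in a java.io.BufferedOutputStream instead:
            try (OutputStream out =
                     new BufferedOutputStream(fs.create(path, true), 1024 * 1024)) {
                out.write(chunk);
            }

            fs.close();
        }
    }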

I have found the answer.

The bufferSize parameter of FileSystem.create() actually corresponds to io.file.buffer.size, which the documentation describes as:

"The size of buffer for use in sequence files. The size of this buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations."

According to the book "Hadoop: The Definitive Guide", a good starting point is 128 KB.
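A minimal sketch of setting this programmatically, assuming the 128 KB starting point quoted above (the path and class name are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BufferSizeSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // io.file.buffer.size is the default buffer size used by
            // create()/open() when no explicit bufferSize argument is given.
            conf.setInt("io.file.buffer.size", 128 * 1024);

            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/example.dat"); // hypothetical path

            // open(Path, int) overrides the configured value per call.
            try (FSDataInputStream in = fs.open(path, 128 * 1024)) {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) {
                    // process the data ...
                }
            }
            fs.close();
        }
    }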

As for the client-side internal cache: Hadoop transmits data in packets (64 KB by default). The packet size can be tuned with the dfs.client-write-packet-size option in the hdfs-site.xml configuration. For my purposes I used 4 MB.
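The same key can also be set on the client's Configuration before creating the stream. A rough sketch, using the 4 MB value mentioned above (the path and class name are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PacketSizeSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Default packet size is 64 KB; raise it to 4 MB as described above.
            conf.setInt("dfs.client-write-packet-size", 4 * 1024 * 1024);

            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/example.dat"); // hypothetical path

            try (FSDataOutputStream out = fs.create(path, true)) {
                out.write(new byte[1024 * 1024]);
            }
            fs.close();
        }
    }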
