How to read an S3 file chunk by chunk in Java?

Question

I have a use case where I have one S3 file. The file is not very large in size, but it can contain 10-50 million single-row records. I want to read a specific byte range. I have read that we can use the Range header in S3 GetObject, like this:

    final GetObjectRequest request = new GetObjectRequest(s3Bucket, key);
    request.withRange(byteStartRange, byteEndRange);
    return s3Client.getObject(request);

But I want to know: does the byte range always guarantee a complete line?

For example, my S3 file content is:

dhjdjdjdjdk
djdjjdfddkkd
dhdjjdjdjdd
cjjjdjdddd
......

If I specify the byte range to be some range X to Y, will it guarantee that only full lines are read, or can it return an incomplete line that falls within the byte range?

Answer 1

No, the Range will not guarantee a complete line.

It returns only the specific range of bytes requested. Amazon S3 has no insight into the contents of a file, so it cannot parse or recognize newline characters.

You will need to request a large enough range that it (hopefully) contains a complete line. Then your code would need to determine where the line ends and the next line begins.
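One possible way to handle the line boundaries in your own code is sketched below. It is only an illustration, not an official AWS utility: the class name ChunkLineAssembler and its methods are hypothetical, and it assumes the file is UTF-8 text with "\n" (or "\r\n") line endings and that the ranges are read in order, one after another. The idea is to keep the bytes of the unfinished last line of each chunk and prepend them to the next chunk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of the AWS SDK): rebuilds complete lines
 * from consecutive byte-range chunks of a UTF-8 text file.
 */
final class ChunkLineAssembler {
    // Bytes of a line that started in an earlier chunk and has not ended yet.
    private final ByteArrayOutputStream carry = new ByteArrayOutputStream();

    /** Feed the bytes of the next range; returns the lines completed by this chunk. */
    List<String> accept(byte[] chunk) {
        List<String> lines = new ArrayList<>();
        int lineStart = 0;
        for (int i = 0; i < chunk.length; i++) {
            // Splitting on the raw '\n' byte is safe for UTF-8, because that
            // byte value never appears inside a multi-byte character.
            if (chunk[i] == '\n') {
                carry.write(chunk, lineStart, i - lineStart);
                lines.add(drainCarry());
                lineStart = i + 1;
            }
        }
        // Whatever follows the last newline is an incomplete line: keep it.
        carry.write(chunk, lineStart, chunk.length - lineStart);
        return lines;
    }

    /** Call after the last chunk; returns the final unterminated line, or null if there is none. */
    String finish() {
        return carry.size() == 0 ? null : drainCarry();
    }

    private String drainCarry() {
        String line = new String(carry.toByteArray(), StandardCharsets.UTF_8);
        carry.reset();
        // Drop a trailing '\r' so that "\r\n" line endings are handled too.
        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
    }
}
```

In this sketch, each chunk passed to accept(...) would be the bytes read from the object returned by s3Client.getObject(request) for one range, and finish() would be called once after the last range. If the ranges are processed independently (for example in parallel), this carry-over approach does not apply as written; each worker would instead have to skip the partial first line of its range and read past the end of its range until the next newline.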
