
How to read an S3 file chunk by chunk in Java?

I have a use case where I have one S3 file. The file is not very large, but it can contain 10-50 million single-row records. I want to read a specific byte range. I have read that we can use the Range header in an S3 GetObject request, like this:

    final GetObjectRequest request = new GetObjectRequest(s3Bucket, key);
    // Both the start and end offsets are inclusive byte positions.
    request.withRange(byteStartRange, byteEndRange);
    return s3Client.getObject(request);

But I want to know: does the byte range always guarantee a complete line?

For example:

My S3 file content is:

  • dhjdjdjdjdk
  • djdjjdfddkkd
  • dhdjjdjdjdd
  • cjjjdjdddd
  • ......

If I specify a byte range from X to Y, will it guarantee that only full lines are read, or can it return an incomplete line that falls within the byte range?

No, the Range header does not guarantee a complete line.

S3 returns exactly the range of bytes requested and nothing more. Amazon S3 has no insight into the contents of an object, so it cannot parse or recognize newline characters.
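To make this concrete, here is a small sketch that imitates an inclusive byte range on a local array rather than on a real S3 object. The class name `RangeSliceDemo` and the method `slice` are hypothetical and exist only for illustration; the sample data reuses the first lines from the question and assumes `\n` line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RangeSliceDemo {

    // Hypothetical helper: returns bytes start..endInclusive of a local array,
    // mimicking what an inclusive byte range request would hand back.
    static String slice(byte[] object, int start, int endInclusive) {
        byte[] part = Arrays.copyOfRange(object, start, endInclusive + 1);
        return new String(part, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] object = "dhjdjdjdjdk\ndjdjjdfddkkd\ndhdjjdjdjdd\n"
                .getBytes(StandardCharsets.US_ASCII);

        // Bytes 5..17 start in the middle of line 1 and stop in the middle of line 2.
        System.out.println(slice(object, 5, 17));
    }
}
```

The slice contains the tail of the first line, a newline, and the head of the second line, so neither line is complete.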

You will need to request a range large enough that it (hopefully) contains at least one complete line. Your code then needs to determine where each line ends and the next one begins, and to handle the partial lines at the start and end of the range.
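One possible way to do that, sketched here under some assumptions: the ranges are read consecutively, lines end with `\n`, and the text is UTF-8 (where the byte `0x0A` never occurs inside a multi-byte character). The class `ChunkLineSplitter`, its `Result` type, and the `split` method are hypothetical names, not part of the AWS SDK. In real use, each `chunk` would be the bytes read from the object content returned for one range request.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkLineSplitter {

    /** Complete lines found so far, plus trailing bytes not yet terminated by '\n'. */
    public static final class Result {
        public final List<String> lines = new ArrayList<>();
        public byte[] remainder = new byte[0];
    }

    /**
     * Joins the unfinished bytes carried over from the previous chunk with the
     * new chunk, emits every newline-terminated line, and keeps the rest.
     */
    public static Result split(byte[] carry, byte[] chunk) {
        byte[] data = new byte[carry.length + chunk.length];
        System.arraycopy(carry, 0, data, 0, carry.length);
        System.arraycopy(chunk, 0, data, carry.length, chunk.length);

        Result result = new Result();
        int lineStart = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                result.lines.add(
                        new String(data, lineStart, i - lineStart, StandardCharsets.UTF_8));
                lineStart = i + 1;
            }
        }
        // Whatever follows the last newline is an incomplete line.
        result.remainder = Arrays.copyOfRange(data, lineStart, data.length);
        return result;
    }

    public static void main(String[] args) {
        byte[] all = "aaa\nbbbb\ncc\n".getBytes(StandardCharsets.UTF_8);

        // Two consecutive "ranges" that cut the second line in half.
        Result first = split(new byte[0], Arrays.copyOfRange(all, 0, 6));
        Result second = split(first.remainder, Arrays.copyOfRange(all, 6, all.length));

        System.out.println(first.lines);
        System.out.println(second.lines);
    }
}
```

Carrying the remainder forward means no line is lost or duplicated, whatever the range size. If the ranges are processed independently (for example in parallel), a different convention is needed, such as each reader skipping its leading partial line and reading past its end offset to finish its last line.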

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.

 