
How to read a zipped CSV file using Java inside an AWS S3 bucket?

I needed to read a .csv file from an S3 bucket, which I did with:

S3Object s3Obj = amazonS3Client.getObject(bucketname, fileName);
BufferedReader reader = new BufferedReader(new InputStreamReader(s3Obj.getObjectContent())); 

Now the same .csv file is stored in the S3 bucket in archived (zipped) form. I need to read this .csv file without first unzipping it to disk on my server.

Is there any AWS documentation or API for reading the .csv file directly, without unzipping it first?

You can read a zipped CSV file directly from Amazon S3 with these steps:

  1. Get the object from S3
  2. Create a ZipInputStream with the object's data
  3. Create a Reader with the ZipInputStream

Example:

AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();  
S3Object object = s3Client.getObject("mybucket","myfile.csv.zip");  
ZipInputStream in = new ZipInputStream(object.getObjectContent());  
BufferedReader reader = new BufferedReader(new InputStreamReader(in));  

Because a zip file can contain many files, you need to position the ZipInputStream at the beginning of each ZipEntry to read each contained file individually. (Even if your zip file contains only one file, you must do this once to position the ZipInputStream at the beginning of the lone ZipEntry.)

String line;
while (in.getNextEntry() != null) { // loop through each file within the zip
    while ((line = reader.readLine()) != null) { // loop through each line
        System.out.println(line);
    }
}
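The pieces above can be put together as a self-contained sketch. To keep it runnable without AWS credentials, it reads from a small in-memory ZIP instead of `object.getObjectContent()`; with S3 you would pass that stream to the same method. The class name `ZippedCsvReader`, the helper names `readAllLines` and `sampleZip`, the sample CSV content, and the UTF-8 charset are all assumptions made for illustration, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZippedCsvReader {

    // Collects every line of every file entry in a ZIP stream (hypothetical helper).
    public static List<String> readAllLines(InputStream zipped) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) { // position at the next entry
                if (entry.isDirectory()) {
                    continue; // directories carry no data
                }
                // One reader per entry. It is deliberately not closed here,
                // because closing it would also close the ZipInputStream.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zin, StandardCharsets.UTF_8)); // UTF-8 is assumed
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    // Builds a tiny in-memory ZIP holding one CSV file; stands in for the S3 object.
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry("myfile.csv"));
            zout.write("id,name\n1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (String line : readAllLines(new ByteArrayInputStream(sampleZip()))) {
            System.out.println(line);
        }
    }
}
```

The try-with-resources block closes the ZipInputStream, and with it the underlying stream, once all entries have been read, which matters for an S3 object stream because it holds an open HTTP connection.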

If s3Obj.getObjectContent() in your example returns a ZIP-compressed file stream, then something like the following should work to access its entries:

ZipInputStream in = new ZipInputStream(s3Obj.getObjectContent());
ZipEntry entry; // java.util.zip.ZipEntry; was undeclared in the original snippet
while ((entry = in.getNextEntry()) != null) {
    System.out.printf("entry: %s%n", entry.getName());
}
in.close();
