Read and write to a file in an Amazon S3 bucket

I need to read a large (> 15 MB) file (say, sample.csv) from an Amazon S3 bucket. I then need to process the data in sample.csv and keep writing the processed data to another directory in the S3 bucket. I intend to use an AWS Lambda function to run my Java code.

As a first step, I developed Java code that runs on my local system. It reads the sample.csv file from the S3 bucket, and I used the putObject method to write the data back to the S3 bucket. However, I find that only the last line was processed and written back.

Region clientRegion = Region.Myregion;
AwsBasicCredentials awsCreds = AwsBasicCredentials.create("myAccessId", "mySecretKey");
S3Client s3Client = S3Client.builder()
        .region(clientRegion)
        .credentialsProvider(StaticCredentialsProvider.create(awsCreds))
        .build();
ResponseInputStream<GetObjectResponse> s3objectResponse = s3Client.getObject(
        GetObjectRequest.builder().bucket(bucketName).key("Input/sample.csv").build());
BufferedReader reader = new BufferedReader(new InputStreamReader(s3objectResponse));
String line = null;
while ((line = reader.readLine()) != null) {
    s3Client.putObject(
            PutObjectRequest.builder().bucket(bucketName).key("Test/Testout.csv").build(),
            RequestBody.fromString(line));
}

Example: sample.csv contains

1,sam,21,java,beginner;
2,tom,28,python,practitioner;
3,john,35,c#,expert.

My output should be

1,mas,XX,java,beginner;
2,mot,XX,python,practitioner;
3,nhoj,XX,c#,expert.

But only the last line, 3,nhoj,XX,c#,expert., is written to Testout.csv.

The putObject() method creates an Amazon S3 object.

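It is not possible to append to or modify an existing S3 object. Each time the while loop executes, putObject() creates a new object under the same key (Test/Testout.csv), replacing the one written by the previous iteration. That is why only the last line remains.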
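To make the overwrite behaviour concrete, here is a minimal sketch that runs without AWS. FakeBucket is a hypothetical in-memory stand-in for a bucket (it is not part of the AWS SDK); the only property it models is that a key holds exactly one object, so a second put to the same key replaces the first.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OverwriteSketch {
    static final String KEY = "Test/Testout.csv";

    // Hypothetical stand-in for an S3 bucket: one object body per key.
    static final class FakeBucket {
        private final Map<String, String> objects = new HashMap<>();

        void put(String key, String body) {
            objects.put(key, body); // replaces any earlier body stored under this key
        }

        String get(String key) {
            return objects.get(key);
        }
    }

    // Mirrors the loop in the question: one put per line, always to the same key.
    static String putPerLine(List<String> lines) {
        FakeBucket bucket = new FakeBucket();
        for (String line : lines) {
            bucket.put(KEY, line);
        }
        return bucket.get(KEY);
    }

    // Builds the complete content first, then issues a single put.
    static String putOnce(List<String> lines) {
        FakeBucket bucket = new FakeBucket();
        bucket.put(KEY, String.join("\n", lines));
        return bucket.get(KEY);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "1,mas,XX,java,beginner;",
                "2,mot,XX,python,practitioner;",
                "3,nhoj,XX,c#,expert.");
        System.out.println("put per line keeps: " + putPerLine(lines));
        System.out.println("put once keeps:");
        System.out.println(putOnce(lines));
    }
}
```

putPerLine ends up holding only the third row, which matches what you see in Testout.csv, while putOnce keeps all three rows because the object is written a single time with the full content.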

Instead, I would recommend:

  • Download the source file from Amazon S3 to local disk (use GetObject() with a destinationFile to download to disk)
  • Process the file and output to a local file
  • Upload the output file to the Amazon S3 bucket (putObject() method)

This separates the AWS code from your processing code, which should be easier to maintain.
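As a rough sketch of the middle step, the code below processes a local file into another local file using only the Java standard library. The transformation (reverse the second field, replace the third with XX) is an assumption inferred from the example rows in the question, and the class and method names are hypothetical. The download and upload steps are left as comments: with the AWS SDK for Java 2.x they would likely be s3Client.getObject(request, path) and s3Client.putObject(request, RequestBody.fromFile(path)), reusing the s3Client and bucketName from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class CsvProcessSketch {

    // Assumed rule, inferred from the question's example:
    // reverse field 2 (the name) and mask field 3 (the age) with "XX".
    static String transform(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length < 3) {
            return line; // leave rows with an unexpected shape unchanged
        }
        fields[1] = new StringBuilder(fields[1]).reverse().toString();
        fields[2] = "XX";
        return String.join(",", fields);
    }

    // Streams the input line by line, so the whole file is never held in memory.
    static void processFile(Path input, Path output) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(transform(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("sample", ".csv");
        Path output = Files.createTempFile("Testout", ".csv");

        // Step 1 would download Input/sample.csv from S3 to a local path.
        // Here the sample rows are written locally instead, so the sketch runs without AWS.
        Files.write(input, List.of(
                "1,sam,21,java,beginner;",
                "2,tom,28,python,practitioner;",
                "3,john,35,c#,expert."), StandardCharsets.UTF_8);

        // Step 2: process the local file into a local output file.
        processFile(input, output);

        // Step 3 would upload the output file to Test/Testout.csv with one putObject call.
        Files.readAllLines(output, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Because the output file is complete before anything is uploaded, a single putObject call is enough and no line gets overwritten.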
