
How to read a file from MinIO with the Apache Beam Java SDK

I just started with MinIO and Apache Beam. I have created a bucket on play.min.io and added a few files (say, one.txt and two.txt). I want to access the files stored in that bucket with the Apache Beam Java SDK. When I deal with local files I just pass the file path, like C://new//.., but I don't know how to get files from MinIO. Can anyone help me with the code?

I got it working with a few settings on top of the standard AWS configuration:

  1. AwsServiceEndpoint should point to your minio server (here localhost:9000).
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    ...
    options.as(AwsOptions.class).setAwsServiceEndpoint("http://localhost:9000");
  2. PathStyleAccess has to be enabled (so that bucket access does not translate to a request to "http://bucket.localhost:9000" but to "http://localhost:9000/bucket").

This can be done by extending DefaultS3ClientBuilderFactory, for example with a MinioS3ClientBuilderFactory like this:

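import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import org.apache.beam.sdk.io.aws.options.S3Options;
import org.apache.beam.sdk.io.aws.s3.DefaultS3ClientBuilderFactory;

public class MinioS3ClientBuilderFactory extends DefaultS3ClientBuilderFactory {
  @Override
  public AmazonS3ClientBuilder createBuilder(S3Options s3Options) {
    AmazonS3ClientBuilder builder = super.createBuilder(s3Options);
    // Address buckets as http://host/bucket instead of http://bucket.host
    builder.withPathStyleAccessEnabled(true);
    return builder;
  }
}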

and inject it in the options like this:

    Class<? extends S3ClientBuilderFactory> builderFactory = MinioS3ClientBuilderFactory.class;

    options.as(S3Options.class).setS3ClientFactoryClass(builderFactory);
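
To make the path-style requirement above more concrete, here is a small standalone sketch in plain Java (no Beam or AWS dependencies) that builds the two URL shapes for the same object. The class name S3AddressingStyles, its two helper methods, and the bucket name "bucket" are made up for illustration; this is not how the AWS SDK builds requests internally, only a picture of the resulting URLs.

```java
import java.net.URI;

public class S3AddressingStyles {

    // Virtual-hosted style: the bucket name becomes a subdomain of the endpoint host.
    static URI virtualHostedStyle(URI endpoint, String bucket, String key) {
        return URI.create(endpoint.getScheme() + "://" + bucket + "."
                + endpoint.getHost() + portSuffix(endpoint) + "/" + key);
    }

    // Path style: the bucket name is the first segment of the URL path.
    static URI pathStyle(URI endpoint, String bucket, String key) {
        return URI.create(endpoint.getScheme() + "://"
                + endpoint.getHost() + portSuffix(endpoint) + "/" + bucket + "/" + key);
    }

    // Returns ":port" when the endpoint has an explicit port, otherwise an empty string.
    private static String portSuffix(URI endpoint) {
        return endpoint.getPort() == -1 ? "" : ":" + endpoint.getPort();
    }

    public static void main(String[] args) {
        URI endpoint = URI.create("http://localhost:9000");
        // prints http://bucket.localhost:9000/one.txt
        System.out.println(virtualHostedStyle(endpoint, "bucket", "one.txt"));
        // prints http://localhost:9000/bucket/one.txt
        System.out.println(pathStyle(endpoint, "bucket", "one.txt"));
    }
}
```

A local MinIO server usually cannot answer for a host name like bucket.localhost, which is why the second form is the one you want here.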


 