Download a Large Number of Files Using the Java SDK for Amazon S3 Bucket

I have a large number of files that need to be downloaded from an S3 bucket. My problem is similar to this article except I am trying to run it in Java.

public static void main(String[] args) {
    AWSCredentials myCredentials = new BasicAWSCredentials("key", "secret");
    TransferManager tx = new TransferManager(myCredentials);
    File file = <thefile>; // placeholder for the local destination directory
    try {
        MultipleFileDownload myDownload = tx.downloadDirectory("<bucket>", null, file);
        System.out.println("Transfer: " + myDownload.getDescription());
        System.out.println("  - State: " + myDownload.getState());
        System.out.println("  - Progress: " + myDownload.getProgress().getBytesTransfered());

        while (!myDownload.isDone()) {
            System.out.println("Transfer: " + myDownload.getDescription());
            System.out.println("  - State: " + myDownload.getState());
            System.out.println("  - Progress: " + myDownload.getProgress().getBytesTransfered());
            try {
                // Do work while we wait for our download to complete...
                Thread.sleep(500);
            } catch (InterruptedException ex) {
                ex.printStackTrace();
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

This was adapted from the TransferManager class example for multiple file upload. There are well over 100,000 objects in this bucket. Any help would be great.

Please use the list() method to get a list of your files, then use the get() method to get each file.

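import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class S3 extends AmazonS3Client {

    final String bucket;

    S3(String u, String p, String bucket) {
        super(new BasicAWSCredentials(u, p));
        this.bucket = bucket;
    }

    // Minimal logging helper; the original snippet called log() without defining it.
    static void log(String message) {
        System.err.println(message);
    }

    // Returns the content of object k as a string, or null on failure.
    String get(String k) {
        try (InputStream i = getObject(bucket, k).getObjectContent()) {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final byte[] b = new byte[1024];
            for (int n = i.read(b); n != -1; n = i.read(b)) {
                out.write(b, 0, n);
            }
            // Decode once at the end (platform default charset, as before) so
            // multi-byte characters are not split across buffer boundaries.
            return new String(out.toByteArray());
        } catch (Exception e) {
            log("Cannot get " + bucket + "/" + k + " from S3 because " + e);
        }
        return null;
    }

    // Returns all keys under prefix d, or an empty array on failure.
    String[] list(String d) {
        try {
            final List<String> keys = new ArrayList<>();
            ObjectListing l = listObjects(bucket, d);
            while (true) {
                for (S3ObjectSummary k : l.getObjectSummaries()) {
                    keys.add(k.getKey());
                }
                // A single listing is truncated for large buckets, so keep
                // fetching batches until there are no more.
                if (!l.isTruncated()) {
                    break;
                }
                l = listNextBatchOfObjects(l);
            }
            return keys.toArray(new String[0]);
        } catch (Exception e) {
            log("Cannot list " + bucket + "/" + d + " on S3 because " + e);
        }
        return new String[]{};
    }
}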

TransferManager internally uses a CountDownLatch, which makes me believe it downloads concurrently (which seems the right way to do it). Wouldn't it make more sense to use it rather than getting one file after another sequentially?
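
To make the sequential-versus-concurrent point concrete, here is a minimal sketch using only the JDK. It is not how TransferManager is implemented. It only shows the general pattern of submitting every download to a thread pool before waiting on any of them. The class name ConcurrentFetch, the method fetchAll, and the fetch parameter are all hypothetical; fetch stands in for a per-object download such as a get(key) call.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ConcurrentFetch {

    // Fetches every key on a fixed-size thread pool and returns key -> content.
    // `fetch` is a hypothetical stand-in for a per-object download.
    static Map<String, String> fetchAll(List<String> keys,
                                        Function<String, String> fetch,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all downloads first so they can overlap.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String key : keys) {
                Callable<String> task = () -> fetch.apply(key);
                pending.put(key, pool.submit(task));
            }
            // Then wait for each result, keeping the submission order.
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> e : pending.entrySet()) {
                try {
                    results.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for " + e.getKey(), ex);
                } catch (ExecutionException ex) {
                    throw new IllegalStateException("Fetch failed for " + e.getKey(), ex.getCause());
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a blocking fetch, a sequential loop takes roughly the sum of all download times, while this version is limited by the pool size. For 100,000+ objects you would probably stream results to disk instead of holding them all in a map.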
