Download a Large Number of Files Using the Java SDK for Amazon S3 Bucket
I have a large number of files that need to be downloaded from an S3 bucket. My problem is similar to this article, except that I am trying to do it in Java.
public static void main(String[] args) {
    AWSCredentials myCredentials = new BasicAWSCredentials("key", "secret");
    TransferManager tx = new TransferManager(myCredentials);
    File file = <thefile>
    try {
        // A null key prefix asks for every object in the bucket.
        MultipleFileDownload myDownload = tx.downloadDirectory("<bucket>", null, file);
        System.out.println("Transfer: " + myDownload.getDescription());
        System.out.println(" - State: " + myDownload.getState());
        System.out.println(" - Progress: " + myDownload.getProgress().getBytesTransfered());
        while (!myDownload.isDone()) {
            System.out.println("Transfer: " + myDownload.getDescription());
            System.out.println(" - State: " + myDownload.getState());
            System.out.println(" - Progress: " + myDownload.getProgress().getBytesTransfered());
            try {
                // Do work while we wait for our download to complete...
                Thread.sleep(500);
            } catch (InterruptedException ex) {
                ex.printStackTrace();
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
This was adapted from the TransferManager class example for multiple file upload. There are well over 100,000 objects in this bucket. Any help would be great.
Use the list() method below to get the keys of your files, then call the get() method for each key.
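import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class S3 extends AmazonS3Client {
    final String bucket;

    S3(String u, String p, String bucket) {
        super(new BasicAWSCredentials(u, p));
        this.bucket = bucket;
    }

    // Returns the content of object k as a String, or null on failure.
    String get(String k) {
        // try-with-resources closes the object stream and its connection
        try (InputStream i = new BufferedInputStream(getObject(bucket, k).getObjectContent())) {
            final ByteArrayOutputStream s = new ByteArrayOutputStream();
            final byte[] b = new byte[1024];
            for (int n = i.read(b); n != -1; n = i.read(b)) {
                s.write(b, 0, n);
            }
            // Decode once, using the platform default charset
            return s.toString();
        } catch (Exception e) {
            System.err.println("Cannot get " + bucket + "/" + k + " from S3 because " + e);
        }
        return null;
    }

    // Returns the keys of all objects under prefix d, or an empty array on failure.
    String[] list(String d) {
        try {
            final List<String> keys = new ArrayList<String>();
            ObjectListing l = listObjects(bucket, d);
            while (true) {
                for (S3ObjectSummary k : l.getObjectSummaries()) {
                    keys.add(k.getKey());
                }
                if (!l.isTruncated()) {
                    break;
                }
                // Each response holds at most 1,000 keys; fetch the next page
                l = listNextBatchOfObjects(l);
            }
            return keys.toArray(new String[0]);
        } catch (Exception e) {
            System.err.println("Cannot list " + bucket + "/" + d + " on S3 because " + e);
        }
        return new String[]{};
    }
}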
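Note that get() above returns each object as a String, which suits text but would likely corrupt binary files. If the goal is to save the objects to disk, one option is to copy the stream straight to a file. The following is only a sketch using the standard library: StreamToFileSketch and save() are hypothetical names, and the ByteArrayInputStream stands in for the stream that getObjectContent() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the AWS SDK.
class StreamToFileSketch {

    /** Copies the stream to target byte for byte and returns the number of bytes written. */
    static long save(InputStream in, Path target) {
        try (InputStream src = in) { // close the stream when the copy is done
            return Files.copy(src, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {0, (byte) 0xFF, 10, 13}; // bytes that are not valid text
        Path target = Files.createTempFile("s3-object", ".bin");
        long written = save(new ByteArrayInputStream(payload), target);
        System.out.println(written + " bytes written to " + target);
        Files.delete(target);
    }
}
```

Because nothing is decoded into characters, the file on disk should match the object exactly, and memory use stays flat regardless of object size.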
TransferManager internally uses CountDownLatch, which makes me believe it downloads concurrently (which seems like the right way to do it). Doesn't it make more sense to use it than to get one file after another sequentially?
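To illustrate the general pattern this comment refers to, here is a minimal sketch of concurrent fetching with a thread pool and a CountDownLatch. It is not TransferManager's actual implementation: ConcurrentFetchSketch and fetchAll() are hypothetical names, and the fetch function is a stand-in for a single S3 GET.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical helper, not part of the AWS SDK.
class ConcurrentFetchSketch {

    /**
     * Fetches every key on a fixed-size thread pool and blocks until all
     * tasks have finished. fetch is a stand-in for one S3 GET request.
     */
    static Map<String, byte[]> fetchAll(List<String> keys,
                                        Function<String, byte[]> fetch,
                                        int threads) {
        Map<String, byte[]> results = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(keys.size());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (String key : keys) {
                pool.submit(() -> {
                    try {
                        results.put(key, fetch.apply(key));
                    } finally {
                        done.countDown(); // count down even if the fetch failed
                    }
                });
            }
            done.await(); // wait until every task has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for downloads", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a.txt", "b.txt", "c.txt");
        // The lambda fakes a download by returning a buffer sized by the key length.
        Map<String, byte[]> out = fetchAll(keys, k -> new byte[k.length()], 2);
        System.out.println(out.size() + " objects fetched");
    }
}
```

With 100,000 small objects, most of the time in a sequential loop is probably spent waiting on network round trips, so overlapping requests like this tends to help. The pool size is a tuning choice, not a recommendation.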