How to copy a large number of files from one S3 folder to another

I'm trying to move a large number of files (at most about 300 KB each) from one S3 folder to another.

I'm using the AWS SDK for Java and tried moving around 1,500 files.

It took too long, and the number of files may grow to 10,000.

Each file has to be copied and then deleted from the source folder, because there is no method that moves a file.

This is what I tried:

public void moveFiles(String fromKey, String toKey) {
    Stream<S3ObjectSummary> objectSummeriesStream = this.getObjectSummeries(fromKey);
    objectSummeriesStream.forEach(file ->
        {
            this.s3Bean.copyObject(bucketName, file.getKey(), bucketName, toKey);
            this.s3Bean.deleteObject(bucketName, file.getKey());
        });

}

private Stream<S3ObjectSummary> getObjectSummeries(String key) {

    // get the files that their prefix is "key" (can be consider as Folders).
    ListObjectsRequest listObjectsRequest = new ListObjectsRequest().withBucketName(this.bucketName)
        .withPrefix(key);
    ObjectListing outFilesList = this.s3Bean.listObjects(listObjectsRequest);
    return outFilesList.getObjectSummaries()
        .stream()
        .filter(x -> !x.getKey()
            .equals(key));
}

If this is a Java application, you can try using several threads to copy the files:

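import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Fixed pool of 20 worker threads shared by all move operations.
private final ExecutorService executorService = Executors.newFixedThreadPool(20);

public void moveFiles(String fromKey, String toKey) {
    Stream<S3ObjectSummary> objectSummeriesStream = this.getObjectSummeries(fromKey);
    objectSummeriesStream.forEach(file ->
        // Each copy-then-delete pair runs as one task on the pool.
        executorService.submit(() -> {
            // Assumes fromKey and toKey are prefixes ("folders"): keep the rest of the key.
            String destinationKey = toKey + file.getKey().substring(fromKey.length());
            this.s3Bean.copyObject(bucketName, file.getKey(), bucketName, destinationKey);
            this.s3Bean.deleteObject(bucketName, file.getKey());
        }));
}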

This should speed up the process.
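The threaded approach above submits tasks but never waits for them, so a caller cannot tell when the move has finished or whether any copy failed. Below is a minimal sketch of one way to handle that with `ExecutorService.invokeAll`. To keep it runnable without AWS, it uses a hypothetical `ObjectStore` interface and an in-memory `InMemoryStore` in place of the real S3 client; those names, `ParallelMover`, and the prefix-based `destinationKey` helper are assumptions for illustration and not part of the AWS SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the two S3 client calls used in the answer. */
interface ObjectStore {
    void copyObject(String sourceKey, String destinationKey);
    void deleteObject(String key);
}

/** In-memory ObjectStore so the sketch runs without AWS. */
class InMemoryStore implements ObjectStore {
    final Map<String, String> objects = new ConcurrentHashMap<>();

    @Override
    public void copyObject(String sourceKey, String destinationKey) {
        String content = objects.get(sourceKey);
        if (content == null) {
            throw new IllegalStateException("No such key: " + sourceKey);
        }
        objects.put(destinationKey, content);
    }

    @Override
    public void deleteObject(String key) {
        objects.remove(key);
    }
}

class ParallelMover {
    /** Maps a key under fromPrefix to the same relative path under toPrefix. */
    static String destinationKey(String sourceKey, String fromPrefix, String toPrefix) {
        return toPrefix + sourceKey.substring(fromPrefix.length());
    }

    /** Moves every key in parallel and returns only after all moves have finished. */
    static int moveAll(ObjectStore store, List<String> keys,
                       String fromPrefix, String toPrefix, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String key : keys) {
                tasks.add(() -> {
                    String target = destinationKey(key, fromPrefix, toPrefix);
                    store.copyObject(key, target);
                    store.deleteObject(key); // runs only if the copy did not throw
                    return target;
                });
            }
            int moved = 0;
            // invokeAll blocks until every submitted task has completed.
            for (Future<String> result : pool.invokeAll(tasks)) {
                result.get(); // surfaces a failed copy or delete
                moved++;
            }
            return moved;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while moving objects", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A move failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With the real SDK, the task body would presumably call `s3Bean.copyObject(bucketName, key, bucketName, target)` and `s3Bean.deleteObject(bucketName, key)` as in the answer. Also keep in mind that a single `listObjects` call returns at most 1,000 keys, so with 1,500 or more files the listing has to be continued (for example via `isTruncated()` and `listNextBatchOfObjects`) before all keys can be moved.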

An alternative might be AWS Lambda. Once a file appears in the source bucket you can, for example, put an event on an SQS FIFO queue, and a Lambda function triggered by that event copies the single file. If I am not mistaken, up to 500 Lambda instances can run in parallel, so this should be fast.
