AWS S3 data transfer using AWS CLI

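I am trying to use the AWS CLI to transfer 25 TB of data stored in S3 from one AWS account to an S3 bucket in another AWS account (the two accounts are in different regions). Can anyone suggest which EC2 instance would be best for running the transfer with the CLI and, above all, how long the transfer is likely to take?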

Copying files

Copying is the easy part! Use the AWS Command-Line Interface (CLI):

aws s3 sync s3://source-bucket s3://destination-bucket

The data will be transferred directly between the buckets; it is not downloaded to, and re-uploaded from, the machine running the command. Therefore, it doesn't matter what size EC2 instance you use. You can even run the command from your own computer and it will be about as fast. The CLI sends the necessary copy requests to S3 for each file to be copied.

Using the sync command has the benefit that the copy can be resumed if something goes wrong, since it only copies files that are missing or have been updated since the previous sync.
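
Since the question mentions buckets in different regions, here is a minimal sketch of how the sync might be wrapped for that case. The function name sync_cross_region and the bucket and region values are placeholders of mine, not anything from the original setup. The --source-region, --region and --dryrun options are standard aws s3 sync options.

```shell
# Sketch: sync between two buckets that live in different regions.
# sync_cross_region is a hypothetical helper; all names passed to it are placeholders.
sync_cross_region() {
  src_bucket="$1"; src_region="$2"
  dst_bucket="$3"; dst_region="$4"
  shift 4
  # --source-region is the region of the source bucket;
  # --region is the region of the destination bucket.
  # Extra arguments (for example --dryrun) are passed through unchanged.
  aws s3 sync "s3://$src_bucket" "s3://$dst_bucket" \
    --source-region "$src_region" --region "$dst_region" "$@"
}

# Example with placeholder names: preview first, then copy for real.
#   sync_cross_region source-bucket us-west-2 destination-bucket eu-west-1 --dryrun
#   sync_cross_region source-bucket us-west-2 destination-bucket eu-west-1
```

Running with --dryrun first lists what would be copied without changing anything, which is a cheap way to confirm that the permissions and bucket names are right before starting a 25 TB copy.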

Permissions

What you will need to consider is how to permit access to copy the files. Let's say you have:

  • Account A with Bucket A
  • Account B with Bucket B
  • You wish to copy from Bucket A to Bucket B

You should run the sync command as a user ("User B") in Account B that has permission to write to Bucket B.

You will also need to add a Bucket Policy to Bucket A that specifically permits access by User B. The policy would look something like:

{
  "Id": "Policy1",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyAccess",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::123456789012:user/user-b"
        ]
      }
    }
  ]
}

The ARN in the Principal element is the ARN of User B, and my-bucket stands for the name of Bucket A.
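
As a sketch of how the policy might be attached, assuming it has been saved to a local file: the helper name apply_bucket_policy and the file name policy.json are placeholders of mine, while put-bucket-policy is the standard aws s3api command for this. It has to be run with credentials from Account A, since that account owns Bucket A.

```shell
# Sketch: attach a bucket policy stored in a local JSON file.
# apply_bucket_policy is a hypothetical helper; run it with Account A credentials.
apply_bucket_policy() {
  bucket="$1"       # name of Bucket A (placeholder)
  policy_file="$2"  # path to the JSON policy document (placeholder)
  aws s3api put-bucket-policy --bucket "$bucket" --policy "file://$policy_file"
}

# Example with placeholder names:
#   apply_bucket_policy my-bucket policy.json
# Afterwards, User B can check read access with something like:
#   aws s3 ls s3://my-bucket
```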

Timing

The transfer will be faster if the buckets are in the same region. However, I have no idea how long the transfer will take. 25 TB is a lot of data! (Have you ever tried copying 1 TB of data on a computer? It is slow!)
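
For a rough sense of scale, the arithmetic can be sketched as follows. The throughput figures are assumptions picked purely for illustration, not measurements; the real rate depends on things like the number and size of objects and the regions involved, so a small test copy is the only reliable guide. The helper name estimate_hours is mine.

```shell
# Sketch: hours needed to move a given volume at an assumed sustained throughput.
# estimate_hours is a hypothetical helper; the throughput values are assumptions.
estimate_hours() {
  # $1 = data size in TB (decimal, 1 TB = 1,000,000 MB)
  # $2 = assumed sustained throughput in MB/s
  awk -v tb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f\n", (tb * 1000000) / mbps / 3600 }'
}

estimate_hours 25 100   # prints 69.4 (about three days at an assumed 100 MB/s)
estimate_hours 25 500   # prints 13.9 (at an assumed 500 MB/s)
```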

The nice thing is that you can run the aws s3 sync command multiple times. Let's say you need the transfer to happen over a weekend. You could run the command during the week, and then run it again on the weekend. Only files that have been added or changed since the earlier run would be copied, so the final copy window would be quite small.
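
One way to gauge how small that final window is likely to be is to count what a dry run would still copy. This is only a sketch: pending_copies is a hypothetical helper, the bucket names are placeholders, and it assumes that the dry-run output has one line per object beginning with "(dryrun) copy:", which is how I understand the CLI to print it.

```shell
# Sketch: count the objects a further sync would still have to copy.
# pending_copies is a hypothetical helper; it assumes each pending object
# appears in the --dryrun output as a line starting with "(dryrun) copy:".
pending_copies() {
  aws s3 sync "s3://$1" "s3://$2" --dryrun | grep -c '^(dryrun) copy:'
}

# Example with placeholder names:
#   pending_copies source-bucket destination-bucket
```

If the count is still large shortly before the cut-over, run another full sync first so that the last pass only has to pick up recent changes.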
