
AWS S3 data transfer using AWS CLI

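I am trying to use the AWS CLI to transfer 25 TB of data stored in S3 from one AWS account to an S3 bucket in another AWS account (the two accounts are in different regions). Can anyone suggest which EC2 instance would best handle the transfer using the CLI and, most importantly, how long the transfer will take?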

Copying files

Copying is the easy part! Use the AWS Command-Line Interface (CLI):

aws s3 sync s3://source-bucket s3://destination-bucket

The data will be transferred directly between the buckets; it will not be downloaded and re-uploaded by the machine running the command. Therefore, it doesn't matter what size EC2 instance you use. You can even run the command from your own computer and it will be just as fast. The CLI sends the necessary copy requests to S3 for each object to be copied.

Using the sync command has the benefit that the copy can be resumed if something goes wrong, since it only copies files that are missing or updated since the previous sync.
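Since the two buckets in the question are in different regions, the sync command may need to be told both regions. A minimal sketch, assuming a hypothetical wrapper function named sync_buckets and made-up bucket and region names; --source-region, --region and --dryrun are existing AWS CLI options:

```shell
# Hypothetical wrapper (not part of the AWS CLI): sync one bucket into
# another when the buckets live in different regions.
# Usage: sync_buckets SRC_BUCKET DST_BUCKET SRC_REGION DST_REGION [extra aws args...]
sync_buckets() {
  src_bucket=$1
  dst_bucket=$2
  src_region=$3
  dst_region=$4
  shift 4
  # --source-region names the region of the source bucket,
  # --region names the region of the destination bucket.
  # Any remaining arguments (for example --dryrun) are passed through.
  aws s3 sync "s3://${src_bucket}" "s3://${dst_bucket}" \
    --source-region "${src_region}" \
    --region "${dst_region}" \
    "$@"
}
```

For example, `sync_buckets source-bucket destination-bucket us-east-1 eu-west-1 --dryrun` would list what would be copied without copying anything; dropping `--dryrun` runs the real copy. The region names here are placeholders.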

Permissions

What you will need to consider is how to permit access to copy the files. Let's say you have:

  • Account A with Bucket A
  • Account B with Bucket B
  • You wish to copy from Bucket A to Bucket B

You should run the sync command from a user ("User B") in Account B that has permissions to write to Bucket B.

You will also need to add a Bucket Policy to Bucket A that specifically permits access by User B. The policy would look something like:

{
  "Id": "Policy1",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyAccess",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::123456789012:user/user-b"
        ]
      }
    }
  ]
}

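The Principal value is the ARN of User B, and my-bucket stands for the name of Bucket A. Because this is cross-account access, User B's own IAM policy in Account B must also allow these actions on Bucket A.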
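The policy has to be attached to Bucket A using credentials from Account A. One way to do that from the CLI is sketched below, assuming the policy has been saved to a local file and that a CLI profile for Account A exists; the function name apply_bucket_policy, the file name and the profile name are all hypothetical:

```shell
# Hypothetical helper: attach a bucket policy stored in a local JSON file.
# Usage: apply_bucket_policy BUCKET POLICY_FILE PROFILE
apply_bucket_policy() {
  # put-bucket-policy replaces any policy already on the bucket.
  # The file:// prefix tells the CLI to read the policy from a local file.
  # --profile selects the credentials to use (here: an Account A profile).
  aws s3api put-bucket-policy \
    --bucket "$1" \
    --policy "file://$2" \
    --profile "$3"
}
```

It could be called as `apply_bucket_policy my-bucket policy.json account-a`. Note that this overwrites an existing bucket policy rather than merging with it, so check what is already there first.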

Timing

The transfer will be faster if the buckets are in the same region. However, I can't say how long it will take: that depends on the number and size of the objects and on the throughput you actually get between the two regions. 25 TB is a lot of data! (Have you ever tried copying 1 TB of data on a computer? It is slow!)
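If you can measure the throughput on a small test copy, a back-of-the-envelope estimate is simple arithmetic. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and a sustained throughput that you supply yourself; the function name estimate_hours is hypothetical and the throughput figures in the example are illustrative guesses, not numbers from AWS:

```shell
# Rough transfer-time estimate in hours.
# Usage: estimate_hours SIZE_TB THROUGHPUT_MB_PER_S
estimate_hours() {
  awk -v tb="$1" -v mbps="$2" 'BEGIN {
    seconds = (tb * 1000000) / mbps   # TB -> MB, then divide by MB/s
    printf "%.1f\n", seconds / 3600   # seconds -> hours
  }'
}
```

For example, `estimate_hours 25 100` prints 69.4 (roughly three days at a sustained 100 MB/s), and `estimate_hours 25 500` prints 13.9. Real throughput will vary with object count and size, so treat the result as an order of magnitude only.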

The nice thing is that you can use the aws s3 sync command multiple times. Let's say you need the transfer to happen over a weekend. You could run the command during the week, and then run it again on the weekend. Only files that have been added/changed would be copied, so the final copy window would be quite small.
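Before the final pass you can check how much work is left by doing a dry run, which lists the objects that would be copied without copying them. A small sketch, assuming a hypothetical helper named pending_count and relying on the CLI printing one "(dryrun)" line per pending operation:

```shell
# Hypothetical helper: count the objects a sync would still copy.
# Usage: pending_count SRC_BUCKET DST_BUCKET
pending_count() {
  # --dryrun prints the operations the CLI would perform and transfers nothing.
  # grep -c counts those lines; "|| true" keeps the exit status zero
  # when nothing is pending (grep exits non-zero on zero matches).
  aws s3 sync "s3://$1" "s3://$2" --dryrun | grep -c '^(dryrun)' || true
}
```

Running `pending_count source-bucket destination-bucket` just before the weekend window gives a feel for how small the final copy will be.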

