
Is it good practice for multiple AWS EC2 instances to use the same library files in an AWS S3 bucket?

I'm migrating my computations to AWS. I plan to run about 10 EC2 instances, all of which need the same set of roughly 1,000 library files (5-10 MB each), about 10 GB in total.

Should I upload these 10 GB of library files to each instance? Would that be faster, and would it cost more?

Or should I create an S3 bucket, upload the 10 GB of files to it once, and have the instances use the library files from the bucket?

Or is it possible to upload the library files to one EC2 instance (a library instance) and have the other EC2 instances use that instance?

This may be a very basic question, but I'm in my very first days with AWS and would appreciate your ideas.

Thanks in advance.

You can pre-build an Amazon Machine Image (AMI) that includes these libraries and then launch N instances from that AMI. That will be the fastest way to get the same libraries onto all of your EC2 instances.
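As a rough illustration of the AMI route, here is a minimal Python sketch. It assumes you have already copied the libraries onto one running instance and that `ec2` is a boto3 EC2 client; the helper names `bake_ami` and `launch_from_ami`, the instance ID, the image name and the instance type are all hypothetical placeholders.

```python
def bake_ami(ec2, source_instance_id, name):
    """Create an AMI from an instance that already holds the libraries."""
    # By default create_image reboots the source instance to get a consistent snapshot.
    image_id = ec2.create_image(InstanceId=source_instance_id, Name=name)["ImageId"]
    # Block until the image can be used for launches.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id


def launch_from_ami(ec2, image_id, count, instance_type):
    """Launch `count` identical instances from the AMI and return their IDs."""
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=count,
        MaxCount=count,
    )
    return [instance["InstanceId"] for instance in response["Instances"]]


# Example usage (needs boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ami = bake_ami(ec2, "i-0123456789abcdef0", "libs-v1")
#   ids = launch_from_ami(ec2, ami, 10, "t3.large")
```

Passing the client in as a parameter keeps the two steps easy to dry-run with a stub before pointing them at a real account.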

You could also measure the time taken to download them all from S3 onto EC2. If that is reasonably quick, it would spare you the overhead of maintaining your AMI (applying security patches as they become available, etc.).
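One way to take that measurement is sketched below. It assumes `s3` is a boto3 S3 client and that the libraries sit under a single key prefix; the function name `download_prefix`, the bucket name and the paths are made up for illustration.

```python
import os
import time


def download_prefix(s3, bucket, prefix, dest_dir):
    """Download every object under `prefix`; return (file_count, total_bytes, seconds)."""
    os.makedirs(dest_dir, exist_ok=True)
    start = time.perf_counter()
    count = 0
    total_bytes = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            # Keep the directory layout below the prefix.
            relative = key[len(prefix):].lstrip("/")
            target = os.path.join(dest_dir, *relative.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, key, target)
            count += 1
            total_bytes += obj["Size"]
    return count, total_bytes, time.perf_counter() - start


# Example usage on an instance (placeholders throughout):
#   import boto3
#   files, size, seconds = download_prefix(boto3.client("s3"), "my-lib-bucket", "libs/", "/opt/libs")
#   print(f"{files} files, {size / 1e9:.1f} GB in {seconds:.0f} s")
```

This downloads sequentially, so it gives a conservative figure; a parallel copy would likely finish sooner.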

To bootstrap files from S3 onto an EC2 instance, you can copy them once at launch using a user data script, or pull them on demand.
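A launch-time bootstrap could look roughly like the following sketch, which only builds the user data script text. It assumes the AMI ships with the AWS CLI and that the instance has an IAM role allowing reads from the bucket; `build_user_data`, the bucket name and the destination directory are hypothetical.

```python
import shlex


def build_user_data(bucket, prefix, dest_dir):
    """Return a bash user data script that syncs an S3 prefix to a local directory."""
    source = f"s3://{bucket}/{prefix}"
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",  # stop at the first failing command
        f"mkdir -p {shlex.quote(dest_dir)}",
        f"aws s3 sync {shlex.quote(source)} {shlex.quote(dest_dir)}",
        "",
    ]
    return "\n".join(lines)


script = build_user_data("my-lib-bucket", "libs/", "/opt/libs")
print(script)
```

The resulting string can then be supplied as the `UserData` parameter when launching instances, for example through boto3's `run_instances`.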

For more sophisticated options, you could look at the Code* services (for example CodeDeploy), Multi-Attach EBS volumes (every attached instance can write to them, so with an ordinary file system they are only safe to mount read-only), or even EFS (which is probably overkill in your case).

There are quite a few ways you could approach this.

  1. Your idea of uploading each file to each instance would work just fine. To speed up the process, you could copy all the files to one instance (slow), and copy from there to each of the other instances (faster).

  2. Your idea to copy the files to S3, then load the files from S3 on each instance, is a more typical approach. It is more efficient from a transfer standpoint, since you only need to upload the files to S3 once and then load them onto each instance.

  3. Use Amazon Elastic File System (EFS). Mount the file system on each of your instances and copy the data onto it; it will then be accessible on all of them. Pros: no need to operate on the files in S3, and the data is available on all instances regardless of which Availability Zone they are in. Cons: more complex to set up.

  4. Attach the same EBS volume to every instance using the recently launched Multi-Attach support. Pros: easy to configure, fastest option. Cons: all the instances must be located in the same Availability Zone, and Multi-Attach is only available on Provisioned IOPS (io1/io2) volumes.
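To make the transfer comparison concrete, the arithmetic behind the options above can be sketched as follows, using the 10 GB and 10 instances from the question. The function name and dictionary keys are illustrative only, and the figures count data moved, not price or time.

```python
def transfer_plan_gb(library_gb, instances):
    """Rough data volumes per option: uploaded from outside AWS vs. copied inside AWS."""
    return {
        # Option 1a: push the full set to every instance over the internet.
        "direct": {"uploaded": library_gb * instances, "copied_in_aws": 0},
        # Option 1b: push once to a seed instance, then fan out to the others.
        "seed_instance": {"uploaded": library_gb,
                          "copied_in_aws": library_gb * (instances - 1)},
        # Option 2: push once to S3, then each instance pulls its own copy.
        "s3": {"uploaded": library_gb, "copied_in_aws": library_gb * instances},
        # Options 3 and 4: push once to a shared volume that is read in place.
        "shared_volume": {"uploaded": library_gb, "copied_in_aws": 0},
    }


for option, volumes in transfer_plan_gb(10, 10).items():
    print(f"{option}: {volumes['uploaded']} GB uploaded, "
          f"{volumes['copied_in_aws']} GB copied inside AWS")
```

With these numbers, uploading to every instance directly moves 100 GB over your own connection, while every other option moves 10 GB.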

Hey, at least you have some options!
