
How to programmatically download data to an AWS EC2 instance?

There are 3 machines involved in my task

  • A : my desktop
  • B : EC2 instance spun up by A
  • C : a remote Linux server where the data sits and where I only have read privileges

The task has basically 3 steps

  1. spin up B from A
  2. download data from C to B to a specific location
  3. change some of the downloaded data on B

I know how to do step 1 using awscli or boto3. Steps 2 and 3 are easy if I ssh to the EC2 instance manually. The problem is: if this task needs to be automated, how do I deal with the login credentials?

Specifically, I am thinking of using user_data to run a shell script after the EC2 instance comes up, but the data download uses scp, which needs a password. Alternatively, I could upload an SSH credential file to the EC2 instance, but then I cannot use user_data to run the script for steps 2 and 3.
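For reference, a minimal boto3 sketch of step 1 with a user_data script attached (the AMI ID, key name, and script body are placeholders):

import boto3

# A minimal sketch: launch B with a user_data shell script attached.
# ami-123456, mykey and the script body are placeholders.
user_data = """#!/bin/bash
# step 2 would go here: download the data from C
# step 3 would go here: modify the downloaded data
"""

ec2 = boto3.resource("ec2")
instances = ec2.create_instances(
    ImageId="ami-123456",     # placeholder AMI
    InstanceType="t2.micro",
    KeyName="mykey",          # placeholder key pair
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)
print(instances[0].id)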

So my current solution is done entirely from a shell script:

  1. spin up B from A
  2. upload the SSH credential from A to B
  3. ssh from A to B, with shell commands attached that perform steps 2 and 3 of the task

This solution appears really ugly to me. Is there a better practice in this case?
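Roughly, the ugly version looks like this as a sketch (the key paths, user names, and host addresses are placeholders, and it assumes B's public IP is already known):

import subprocess

# Placeholders: adjust key paths, user names, hosts and paths to your setup.
B_HOST = "ec2-user@B_PUBLIC_IP"
B_KEY = "/path/to/b_key.pem"          # key pair used when launching B
C_CREDENTIAL = "/path/to/c_ssh_key"   # credential that B will use to reach C

# step 2 of the workaround: upload the credential for C from A to B
subprocess.run(
    ["scp", "-i", B_KEY, C_CREDENTIAL, B_HOST + ":~/c_ssh_key"],
    check=True,
)

# step 3 of the workaround: ssh to B and run the download + modification there
remote_cmds = (
    "scp -i ~/c_ssh_key user@c.example.com:/data/file /opt/data/file && "
    "sed -i 's/old/new/' /opt/data/file"
)
subprocess.run(["ssh", "-i", B_KEY, B_HOST, remote_cmds], check=True)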

3 Options

  1. Pass the encrypted/encoded password as part of the user data. The user data script will first decrypt/decode the password and use it to scp the file from C. Then delete the user data, or find some other way to remove the encrypted/encoded password (see the sketch after this list).
  2. Use an SSH key instead of an SSH password. But the risk is that you have to pass the private key in the user data, which is not secure.
  3. Use Ansible and an SSH key. But that is too much work for a simple task.
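For option 1, a rough sketch of passing an encoded password through the user data (it uses plain base64 as the encoding, assumes sshpass is installed on B, and the AMI ID, host name, and paths are placeholders):

import base64
import boto3

# Placeholder password; in real use it should come from somewhere safer.
password = "REPLACE_ME"
encoded = base64.b64encode(password.encode()).decode()

# The user data decodes the password, uses it for the scp from C
# (assumes sshpass is installed on B), then removes the temporary file.
user_data = f"""#!/bin/bash
echo {encoded} | base64 -d > /tmp/.pw
sshpass -f /tmp/.pw scp -o StrictHostKeyChecking=no user@c.example.com:/data/file /opt/data/file
rm -f /tmp/.pw
"""

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-123456",     # placeholder AMI
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)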

There are many ways to solve your task. I will not cover step 1 (spin up B from A) because you have already done it.

Option 1 : Use EC2 Run Command (now part of AWS Systems Manager) to push commands to server B. Flow: A -> EC2 Run Command service -> B -> C. No need to push credentials (SSH key/password) to server B. A rough boto3 sketch is shown after this answer.
Option 2 : Define all your commands in a bash shell file and push this shell file to S3. Use the user data of server B to download that file from S3 and run it. Flow: A -> S3, B gets the file from S3, B -> C.

With the above 2 options, you do not need to push any credentials to server B. Server C can be anywhere, as long as there is a connection between B and C for the download task.
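For option 1, a minimal boto3 sketch of pushing commands from A to B through Run Command (the instance ID and the commands are placeholders; B needs the SSM agent and an instance profile that allows Systems Manager):

import boto3

ssm = boto3.client("ssm")

# Placeholders: the instance ID of B and the actual download/modify commands.
response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],
    DocumentName="AWS-RunShellScript",
    Parameters={
        "commands": [
            "scp user@c.example.com:/data/file /opt/data/file",  # step 2: get data from C
            "sed -i 's/old/new/' /opt/data/file",                # step 3: modify it
        ]
    },
)
print(response["Command"]["CommandId"])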

Give Ansible a try; it can help you automate this task by creating a playbook.

For creating an instance you could use the ec2 module; from the doc examples:

# Basic provisioning example
- ec2:
    key_name: mykey
    instance_type: t2.micro
    image: ami-123456
    wait: yes
    group: webserver
    count: 3
    vpc_subnet_id: subnet-29e63245
    assign_public_ip: yes

To download data, use the get_url module, for example:

- name: Download file with check (md5)
  get_url:
    url: http://example.com/path/file.conf
    dest: /etc/foo.conf
    checksum: md5:66dffb5228a211e61d6d7ef4a86f5758

For modifying files there are multiple modules that can be found at http://docs.ansible.com/ .

Overall, it is a tool that can help automate many things, but some time is required to learn the basics; check the Getting Started guide. Hope it helps.
