Tutorial: Submitting python script to EC2 using boto3 with data from S3
I am new to AWS and want to run an embarrassingly parallel Python work script on an EC2 instance (e.g. c4.4xlarge).
I have gone through questions on the topic, but have not found a high-level answer describing the steps I need to take. I have AWS credentials and have boto3 installed on my laptop's Python 2.
How do I structure a Python submission script that:
In addition, within my Python work script, how do I save the results of the work script back to S3?
Finally, how do I ensure that the Python version that I access via AWS has all the packages that are needed to successfully run my Python work script?
Sorry if the question is too high-level, and for any conceptual mistakes. Thank you for any pointers!
To achieve this, I would suggest adding more detail to your current flow:
In the submission script:
In the EC2 instance:
There are two simple ways to run commands on an EC2 instance: connect over SSH, or use the user-data attribute. For simplicity, and for your current use case, I would recommend the user-data method.
First, you need to create an EC2 instance profile with permissions to download from and upload to the S3 bucket. Then you can launch an EC2 instance, install Python and any pip packages you need, and register it as an AMI.
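As a rough illustration of that setup step, here is a minimal sketch in Python 3 with boto3. The helper names (`s3_read_write_policy`, `create_instance_profile`, `register_ami`), the inline policy name, and the choice of S3 actions are my own assumptions, not something prescribed by AWS or by the reference code; adjust the permissions to what your worker really needs.

```python
import json


def s3_read_write_policy(bucket_name):
    """Build an IAM policy document (assumed minimal set of S3 actions)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # download and upload objects inside the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {   # list the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }


# Trust policy that lets EC2 instances assume the role.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_instance_profile(role_name, profile_name, bucket_name):
    """Create a role with S3 access and wrap it in an instance profile."""
    import boto3  # imported here so the policy helpers work without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(EC2_TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="s3-read-write",  # hypothetical inline policy name
        PolicyDocument=json.dumps(s3_read_write_policy(bucket_name)),
    )
    iam.create_instance_profile(InstanceProfileName=profile_name)
    iam.add_role_to_instance_profile(
        InstanceProfileName=profile_name, RoleName=role_name
    )


def register_ami(instance_id, ami_name):
    """Create an AMI from an instance that already has Python and packages."""
    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_image(InstanceId=instance_id, Name=ami_name)
    return response["ImageId"]
```

Note that a worker which terminates its own instance would additionally need the `ec2:TerminateInstances` permission on its role; the policy above only covers S3.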
Here is some reference code. Note that this code is in Python 3 and is suitable only for Windows machines.
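submission.py:
import boto3

# Define these placeholders before running: bucket_name, image_id,
# your_ec2_type, your_key_name, instance_profile_name,
# instance_security_group, path_to_instance_worker_dir and
# path_to_instance_worker_script.

s3_client = boto3.client('s3')
ec2 = boto3.resource('ec2')

# Pairs of S3 object keys and the local files to upload to them.
deps = {
    'remote': [
        "/path/to/s3-bucket/obj.txt"
    ],
    'local': [
        "/path/to/local-directory/obj.txt"
    ]
}

# Upload the input data the worker will need.
for remote, local in zip(deps['remote'], deps['local']):
    s3_client.upload_file(local, bucket_name, remote)

# Commands that Windows runs at first boot via user data.
user_data = f"""<powershell>
cd {path_to_instance_worker_dir}; python {path_to_instance_worker_script}
</powershell>
"""

# Launch a single instance from the prepared AMI; returns a list of instances.
instances = ec2.create_instances(
    MinCount=1,
    MaxCount=1,
    ImageId=image_id,
    InstanceType=your_ec2_type,
    KeyName=your_key_name,
    IamInstanceProfile={
        'Name': instance_profile_name
    },
    SecurityGroupIds=[
        instance_security_group,
    ],
    UserData=user_data
)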
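instance_worker:
import subprocess

import boto3

# Define these placeholders before running: bucket_name, do_work,
# result_file and result_remote.

s3_client = boto3.client('s3')
ec2_client = boto3.client('ec2')

# Pairs of S3 object keys and the local paths to download them to.
deps = {
    'remote': [
        "/path/to/s3-bucket/obj.txt"
    ],
    'local': [
        "/path/to/local-directory/obj.txt"
    ]
}

# Download the input data uploaded by the submission script.
for remote, local in zip(deps['remote'], deps['local']):
    s3_client.download_file(bucket_name, remote, local)

result = do_work()
# write results to file
s3_client.upload_file(result_file, bucket_name, result_remote)

# Get the instance ID from inside (This is only for Windows machines).
# stdout must be piped, otherwise communicate() returns None for the output.
p = subprocess.Popen(
    ["powershell.exe", "(Invoke-WebRequest -Uri 'http://169.254.169.254/latest/meta-data/instance-id').Content"],
    stdout=subprocess.PIPE
)
out = p.communicate()[0]
instance_id = str(out.strip().decode('ascii'))

# Terminate this instance once the results are safely in S3.
ec2_client.terminate_instances(InstanceIds=[instance_id, ])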
In this code, I terminate the instance from within. To do that, you must first obtain the instance_id; have a look here for more references.
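If your worker does not run on Windows, the same instance ID can be read from the metadata endpoint without PowerShell. Below is a minimal sketch that uses only the standard library and the token-based (IMDSv2) flow. The function name `fetch_instance_id` and its `opener` parameter are my own additions for illustration (the parameter makes the function easy to try out away from EC2); the metadata address is the one already used in the worker code above.

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/latest"


def fetch_instance_id(opener=urllib.request.urlopen, timeout=2):
    """Return this instance's ID; only works from inside an EC2 instance."""
    # IMDSv2: first request a short-lived session token with a PUT.
    token_request = urllib.request.Request(
        f"{METADATA_URL}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with opener(token_request, timeout=timeout) as response:
        token = response.read().decode("ascii")

    # Then send the token along with the actual metadata request.
    id_request = urllib.request.Request(
        f"{METADATA_URL}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with opener(id_request, timeout=timeout) as response:
        return response.read().decode("ascii").strip()
```

Inside the worker you would then call `instance_id = fetch_instance_id()` and pass the result to `terminate_instances` as before.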
Finally, how do I ensure that the python version that I access via AWS has all the packages that are needed to successfully run my python work script?
In theory, you can use the user data to run any scripts or CLI commands you like, including installing Python and pip dependencies. If the installation is too complicated or heavy, though, I would suggest building an image and launching from it, as mentioned before.
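For the lighter case, installing dependencies through user data could look roughly like the sketch below. It assumes a Linux AMI that already ships `python3` and pip (unlike the Windows reference code above), and the helper name `build_user_data` is hypothetical. The returned string would be passed as `UserData` to `create_instances`.

```python
import shlex


def build_user_data(packages, worker_dir, worker_script):
    """Build a bash user-data script: install pip packages, then run the worker."""
    lines = ["#!/bin/bash", "set -e"]  # stop at the first failing command
    if packages:
        # Quote each package name so it is passed to the shell safely.
        quoted = " ".join(shlex.quote(name) for name in packages)
        lines.append(f"python3 -m pip install {quoted}")
    lines.append(f"cd {shlex.quote(worker_dir)}")
    lines.append(f"python3 {shlex.quote(worker_script)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example with made-up paths and packages.
    print(build_user_data(["boto3", "numpy"], "/home/ec2-user/work", "worker.py"))
```

Keep in mind that packages installed this way are downloaded again on every launch, which is the main reason a pre-built AMI tends to be the better fit for heavy dependencies.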