How do I download data from the internet to an S3 bucket via EC2?

Original post 2020-07-02 18:20:20 9 1 python/ amazon-web-services/ amazon-s3/ amazon-ec2/ jupyter-notebook

I want to download several large files from the internet (specifically Reddit monthly submissions from the site PushShift) into an S3 bucket. I am SSHed into an EC2 instance and have a Jupyter notebook running.

Ideally, I want to write a Python script in the Jupyter notebook on my EC2 instance that downloads each file from the internet and then pushes it to my S3 bucket. How would I go about doing this?

1 answer

It is not possible to "download data from the Internet into Amazon S3".

Amazon S3 is an object storage service. You can upload data to S3 and download data from S3, but it is not possible to tell S3 to download data from some other location and store it.

You will need a program running somewhere that obtains the data from the Internet, then uploads it to Amazon S3 (creating an object). Such a program could be clever enough to 'stream' the data to S3 by downloading content in-memory and then sending it to S3, without having to save to disk in between, but you would need to write that code.
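A minimal sketch of that streaming approach, assuming boto3 is installed and the instance has credentials that allow writing to the bucket. The function name `stream_url_to_s3` and its parameters are hypothetical, chosen only for illustration. It passes the open HTTP response straight to boto3's `upload_fileobj`, which reads the stream in chunks, so the file is never written to local disk.

```python
import urllib.request


def stream_url_to_s3(url, bucket, key, s3_client=None):
    """Stream the body at `url` into s3://bucket/key without touching disk.

    `s3_client` is an optional, already-built S3 client (an assumption made
    here so the function is easy to test); by default one is created.
    """
    if s3_client is None:
        import boto3  # imported lazily; only needed when no client is passed in
        s3_client = boto3.client("s3")

    # The HTTP response is a binary file-like object with read(), which is
    # what upload_fileobj expects. boto3 pulls data from it chunk by chunk.
    with urllib.request.urlopen(url) as response:
        s3_client.upload_fileobj(response, bucket, key)
```

In a notebook cell this might be called as `stream_url_to_s3("https://example.com/some-large-file.zst", "my-bucket", "reddit/some-large-file.zst")`, where the URL, bucket name, and key are placeholders rather than real locations.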

As to 'where' such a program might run, it would be most efficient to run such code either as an AWS Lambda function or on an Amazon EC2 instance that is in the same region as the Amazon S3 bucket.

Since you are running a Jupyter notebook on an Amazon EC2 instance, it would be easiest to download the file to local storage, then upload it to S3.
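The download-then-upload route could look roughly like the sketch below. It assumes boto3 is installed and the instance has write access to the bucket; `download_then_upload` and its parameters are hypothetical names used for illustration. The file is saved to a temporary location on the instance's disk, uploaded with boto3's `upload_file`, and then deleted.

```python
import os
import shutil
import tempfile
import urllib.request


def download_then_upload(url, bucket, key, s3_client=None):
    """Download `url` to local storage, upload it to s3://bucket/key,
    and return the number of bytes transferred.

    `s3_client` is an optional, already-built S3 client (an assumption made
    here so the function is easy to test); by default one is created.
    """
    if s3_client is None:
        import boto3  # imported lazily; only needed when no client is passed in
        s3_client = boto3.client("s3")

    # Step 1: copy the remote file to a temporary file on local disk.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        local_path = tmp.name
        with urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmp)

    try:
        size = os.path.getsize(local_path)
        # Step 2: upload the local file to S3.
        s3_client.upload_file(local_path, bucket, key)
    finally:
        # Step 3: remove the local copy so large files do not fill the disk.
        os.remove(local_path)
    return size
```

For several monthly files, the function can simply be called in a loop over the URLs. Make sure the instance's disk has enough free space for the largest file, since each one is stored locally before it is uploaded.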

How do I copy a file from s3 bucket to ec2 instance using lambda function?

How can I navigate into S3 bucket folders from EC2 instance?

Invalid argument type when trying to download specific files from S3 bucket to EC2 using subprocess

Access to Amazon S3 Bucket from EC2 instance

Local access to Amazon S3 Bucket from EC2 instance

How do I transfer files from s3 to my ec2 instance whenever I add a new file to s3?

How to run python file in AWS S3 bucket from EC2?

How to copy files from AWS S3 bucket to EC2 linux machine using AWS Lambda Functions

Parallel/Async Download of S3 data into EC2 in Python?

How do I download a folder from AWS S3 with CloudPathLib from a public bucket?

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1 2 2020-07-02 23:28:21
