
Why is the code not able to find the file specified in the AWS S3 path, when I can find it manually?

I have a bucket called my_bucket and a folder in it called Images. I am trying to read the files (images) inside the Images folder.

file = pd.read_csv(some_csv_file)
X = file.values[:,0]

role = get_execution_role()
bucket='my_bucket'
data_key = 'Images'
data_dir = 's3://{}/{}'.format(bucket, data_key)
s = '/'

for img_name in X:
    seq = (data_dir, img_name)
    img_path = s.join(seq)
    img = imread(img_path)

But it gives the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-20-a273242ed30e> in <module>()
     43     img_path = s.join(seq)
     44     print(img_path)
---> 45     img = imread(img_path)
     46     img = imresize(img, (32, 32))
     47     img = img.astype('float32') # this will help us in later stage

~/anaconda3/envs/python3/lib/python3.6/site-packages/numpy/lib/utils.py in newfunc(*args, **kwds)
     99             """`arrayrange` is deprecated, use `arange` instead!"""
    100             warnings.warn(depdoc, DeprecationWarning, stacklevel=2)
--> 101             return func(*args, **kwds)
    102 
    103         newfunc = _set_function_name(newfunc, old_name)

~/anaconda3/envs/python3/lib/python3.6/site-packages/scipy/misc/pilutil.py in imread(name, flatten, mode)
    162     """
    163 
--> 164     im = Image.open(name)
    165     return fromimage(im, flatten=flatten, mode=mode)
    166 

~/anaconda3/envs/python3/lib/python3.6/site-packages/PIL/Image.py in open(fp, mode)
   2541 
   2542     if filename:
-> 2543         fp = builtins.open(filename, "rb")
   2544         exclusive_fp = True
   2545 

FileNotFoundError: [Errno 2] No such file or directory: 's3://my_bucket/Images/377.jpg'

377.jpg is the first row in X. I checked manually in S3, and the file is there. So why am I getting this error, and how do I fix it? The only reason I can think of is that I am specifying the S3 path incorrectly, but the S3 documentation gives the form 's3://{}/{}'.format(bucket, data_key). Moreover, the filename in the last line of the error message is s3://my_bucket/Images/377.jpg, which is exactly the path I navigate to manually to locate the file in the bucket.

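If the implementation is in Python, use boto3. As the traceback shows, imread ends up calling Python's built-in open(), which only understands local file paths, so the s3:// URI is treated as a local file name that does not exist.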

For example,

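import boto3

# Create an S3 client; it picks up the default credentials (e.g. the notebook's IAM role)
s3 = boto3.client('s3')

# Key is the object's path inside the bucket, without the 's3://my_bucket/' prefix
response = s3.get_object(Bucket='my_bucket', Key='Images/377.jpg')
object_content = response['Body'].read()  # raw bytes of the object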

Reference: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.get_object
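
To tie this back to the loop in the question, here is a minimal sketch of turning an s3:// path into an image array. It assumes Pillow and NumPy are installed and that the notebook's role can read the bucket. The helper names split_s3_uri, load_image_bytes and load_image_from_s3 are made up for illustration and are not part of boto3.

```python
import io
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


def load_image_bytes(s3_client, uri):
    """Download the object behind an S3 URI and return its raw bytes."""
    bucket, key = split_s3_uri(uri)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


def load_image_from_s3(s3_client, uri):
    """Decode the downloaded bytes into a NumPy array (needs Pillow and NumPy)."""
    import numpy as np
    from PIL import Image

    data = load_image_bytes(s3_client, uri)
    return np.array(Image.open(io.BytesIO(data)))


# Possible usage inside the question's loop (assumes boto3 and valid credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   img = load_image_from_s3(s3, img_path)   # img_path = 's3://my_bucket/Images/377.jpg'
```

The S3 client is passed in as an argument so that it is created once and reused for every image, rather than once per loop iteration.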

Check the IAM role attached to your SageMaker notebook instance; it has to be given access to S3. Make sure you have granted read access both to the S3 bucket itself and to all objects in it (the bucket resource followed by /*). You don't have to use boto3.
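
As an illustration of "the bucket and all objects in the bucket /*", the following sketch builds a read-only policy document for a single bucket. The function name read_only_policy is hypothetical, and the exact set of actions your role needs is an assumption; adapt it to your own security requirements before attaching it to the notebook's role.

```python
import json


def read_only_policy(bucket):
    """Build an IAM policy document that allows listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself (no trailing /*)
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": ["arn:aws:s3:::{}".format(bucket)],
            },
            {
                # Reading applies to the objects inside the bucket (the /* part)
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": ["arn:aws:s3:::{}/*".format(bucket)],
            },
        ],
    }


# Print the JSON that could be pasted into the IAM policy editor
print(json.dumps(read_only_policy("my_bucket"), indent=2))
```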
