
Pandas: How to access a file on an in-house NetApp StorageGRID

We have a NetApp StorageGRID (S3) in our company infrastructure, and I am new to S3. After processing a CSV file in Pandas, I need to write it to S3. The URL for the StorageGRID is https://myCompanys3.storage.net and the bucket is 'test_bucket'. I referred to https://stackoverflow.com/a/51777553/13065899

I followed these steps, based on other reading on Python/Pandas/S3:

  1. Created a .aws folder in my user folder (Windows laptop)
  2. Created a credentials file in it with these entries:

'''

[default]
aws_access_key_id=myAccessKey
aws_secret_access_key=mySecretAccessKey

'''

  3. pip install s3fs
  4. Wrote this line of code:

df.to_csv('https://myCompanys3.storage.net/test_bucket/myTest.csv')

Got this error: urllib.error.HTTPError: HTTP Error 403: Forbidden

Is the path given to to_csv above the correct way to construct the full path to the file?

All examples I have seen so far start with 's3://' and not a full url.

Is s3 a keyword that is required for any read/write to StorageGRID?

I also tried:

df.to_csv('s3://https://s3.medcity.net://hpg-dl-dev/PandasInvoiceTest.csv', index=False)

Got this error: Invalid bucket name "https:": Bucket name must match the regex "^[a-zA-Z0-9.-_]{1,255}$"

Can someone help me with what I am missing? Perhaps an S3 configuration where I externalize the URL?

Thank you in advance.
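
On the s3:// question: the prefix is the URL scheme pandas uses to pick the s3fs backend, and what follows it is read as bucket/key, which is why "https:" was rejected as a bucket name. The endpoint URL therefore has to be supplied separately. Below is a minimal sketch, assuming pandas 1.2 or later (for storage_options), that s3fs is installed, and that the StorageGRID endpoint accepts standard S3 requests with the keys in the credentials file. The helper names s3_storage_options and write_csv_to_grid are made up for illustration.

```python
import pandas as pd

# Values taken from the question; adjust them for your environment.
ENDPOINT_URL = "https://myCompanys3.storage.net"
BUCKET = "test_bucket"


def s3_storage_options(endpoint_url):
    """Build the options pandas forwards to s3fs (hypothetical helper).

    s3fs passes client_kwargs on to the underlying S3 client, so this is
    where a non-AWS endpoint can be set.
    """
    return {"client_kwargs": {"endpoint_url": endpoint_url}}


def write_csv_to_grid(df, bucket, key, endpoint_url=ENDPOINT_URL):
    """Write df as CSV to s3://bucket/key on the given endpoint.

    Assumption: credentials are picked up from the [default] profile in
    the .aws/credentials file described in the question.
    """
    df.to_csv(
        f"s3://{bucket}/{key}",
        index=False,
        storage_options=s3_storage_options(endpoint_url),
    )
```

With this in place, the call from the question would become something like write_csv_to_grid(df, BUCKET, "myTest.csv"). Whether the 403 goes away also depends on the permissions attached to the access key.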

  1. Use boto3 to establish your connection and download the file.
  2. Stream the string object into pd.read_csv() using io.StringIO().

import io
import json
from pathlib import Path

import boto3
import pandas as pd

# Load credentials from a local JSON file (keys: REGION_NAME, ACCESS_ID, ACCESS_KEY)
with open(Path.cwd().joinpath("aws-secrets.json")) as f:
    cfg = json.load(f)

sess = boto3.session.Session(region_name=cfg["REGION_NAME"],
                             aws_access_key_id=cfg["ACCESS_ID"],
                             aws_secret_access_key=cfg["ACCESS_KEY"])

# Download the object, decode its bytes to text, and parse the text as CSV
df = pd.read_csv(io.StringIO(
    sess.resource("s3").Object("silicon-myfiles", "elevationdata.csv").get()["Body"].read().decode()
))

