
Download from AWS S3 bucket using boto3 - incorrect timestamp format

I'm using the boto3 library to retrieve a couple of CSV files from an S3 bucket:

    import os
    import boto3

    # Scan the S3 verified folder for files
    s3 = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)

    response = s3.list_objects(Bucket=self.bucket, Prefix='UK_entities/Verified_Matches/')

    # Ignore the first entry in 'Contents', as it is just the folder name. This leaves a list of files
    files = response['Contents'][1:]

    # Download every file under the S3 prefix to the local /verified_matches/ directory
    for obj in files:
        s3.download_file(self.bucket, obj['Key'], os.path.join(filepath, os.path.basename(obj['Key'])))
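
The slice above assumes the folder placeholder is always the first entry in the listing. As a minimal sketch (the helper names `file_keys` and `local_path` and the sample keys are hypothetical, not part of boto3), the placeholder could instead be filtered out by its trailing slash, with the local destination built using `os.path.join`:

```python
import os


def file_keys(contents):
    """Return the object keys from a 'Contents' list, skipping folder placeholders."""
    # A folder placeholder shows up as a key that ends in '/'.
    return [obj['Key'] for obj in contents if not obj['Key'].endswith('/')]


def local_path(directory, key):
    """Build the local destination for a key, keeping only its file name."""
    return os.path.join(directory, os.path.basename(key))


# Hypothetical listing, shaped like response['Contents'] from list_objects
contents = [
    {'Key': 'UK_entities/Verified_Matches/'},
    {'Key': 'UK_entities/Verified_Matches/matches_a.csv'},
    {'Key': 'UK_entities/Verified_Matches/matches_b.csv'},
]

for key in file_keys(contents):
    print(key, '->', local_path('verified_matches', key))
```

Each resulting pair could then be passed to `s3.download_file(bucket, key, destination)` in place of the indexed loop.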

The downloaded file has a column match_date, which is just a timestamp, but it shows a value such as 03:44.7, which isn't correct. When I manually download the CSV from the bucket, the same value is shown correctly as 2019-08-24 01:03:44.732999.

Can anyone explain what is happening here and point me towards how I might control the way timestamps are handled on retrieval?
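
Before changing how the file is retrieved, it may help to check what is actually stored in the downloaded file, since a viewer such as a spreadsheet application may display a timestamp differently from the raw text. A minimal sketch using only the standard library; the helper name `raw_column` and the sample data are assumptions for illustration:

```python
import csv
import io


def raw_column(csv_file, column):
    """Return the values of one column exactly as they appear in the CSV text."""
    reader = csv.DictReader(csv_file)
    return [row[column] for row in reader]


# Hypothetical sample standing in for the contents of the downloaded file
sample = io.StringIO(
    "id,match_date\n"
    "1,2019-08-24 01:03:44.732999\n"
)

print(raw_column(sample, 'match_date'))
```

For a real file, `open(path, newline='')` can be passed instead of the `StringIO` object. If the full timestamp appears in the output, the stored data is intact and the shortened value comes from whatever is displaying it.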

I solved this by specifying the exact format I required before uploading to the S3 bucket. Although the file had the correct format when downloaded from S3 manually, the format seemed to be determined automatically somewhere along the way when it was retrieved with boto3.

from dateutil.tz import gettz
import datetime as dt

# Old version: clust_df['match_date'] = pd.to_datetime('today')
# New version: store the timestamp as an explicit ISO 8601 string with the local UTC offset
df['match_date'] = dt.datetime.now(gettz()).isoformat()
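
To illustrate that the value is written as an unambiguous ISO 8601 string, the sketch below writes such a value to CSV text and parses it back. It uses only the standard library and, as an assumption made for the sake of a reproducible example, a fixed UTC timestamp instead of `dt.datetime.now(gettz())`:

```python
import csv
import datetime as dt
import io

# Fixed timestamp (assumed for illustration) so the output is reproducible
stamp = dt.datetime(2019, 8, 24, 1, 3, 44, 732999, tzinfo=dt.timezone.utc)
text = stamp.isoformat()
print(text)  # 2019-08-24T01:03:44.732999+00:00

# Write the string to an in-memory CSV, then read it back
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['match_date'])
writer.writerow([text])

buffer.seek(0)
rows = list(csv.reader(buffer))

# The stored text parses back to the same instant
parsed = dt.datetime.fromisoformat(rows[1][0])
print(parsed == stamp)  # True
```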
