Download from AWS S3 bucket using boto3 - incorrect timestamp format

Question

I'm using the boto3 library to retrieve a couple of CSV files from an S3 bucket:

    # Requires at module level: import os, import boto3
    # Scan the S3 verified folder for files
    s3 = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)

    response = s3.list_objects(Bucket=self.bucket, Prefix='UK_entities/Verified_Matches/')

    # Ignore the first entry in the dict, as it is just the folder name. Returns a list of files
    files = response['Contents'][1:]

    # Download any files in /s3/verified/ to the local /verified_matches/ folder
    for obj in files:
        # download_file takes (bucket, key, local filename), so build the local path from the key's base name
        local_path = os.path.join(filepath, os.path.basename(obj['Key']))
        s3.download_file(self.bucket, obj['Key'], local_path)

The downloaded file has a column, match_date, which holds a timestamp. In the downloaded copy the value shows as, for example, 03:44.7, which isn't correct. When I manually download the CSV from the bucket, the same value is shown correctly as 2019-08-24 01:03:44.732999.

Can anyone explain what is happening here and point me towards a way of controlling how timestamps are handled on retrieval?

Answer 1

I solved this by writing the timestamp in the exact format I required before uploading to the S3 bucket. Even though a manual download from S3 showed the correct format, it looked as though the format was being inferred somewhere along the way when the file was retrieved with boto3, so I now store the value as an explicit ISO 8601 string:

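from dateutil.tz import gettz
import datetime as dt

# Old version, which stored a pandas Timestamp:
# clust_df['match_date'] = pd.to_datetime('today')
# New version: store an ISO 8601 string that includes the local UTC offset
df['match_date'] = dt.datetime.now(gettz()).isoformat()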
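To illustrate why an explicit ISO 8601 string helps, here is a minimal sketch of a round trip through CSV text. It uses only the standard library rather than pandas or boto3, and the helper names write_rows and read_rows and the entity column are hypothetical, made up for this example. The assumption is that the CSV is treated as plain text, so whatever string is written is the string any consumer reads back.

```python
import csv
import io
from datetime import datetime, timezone


def write_rows(rows):
    """Serialise rows to CSV text, storing match_date as an ISO 8601 string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["entity", "match_date"])
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "entity": row["entity"],
            # isoformat() keeps the date, the microseconds and the UTC offset
            "match_date": row["match_date"].isoformat(),
        })
    return buf.getvalue()


def read_rows(text):
    """Parse the CSV text back, converting match_date to a datetime again."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"entity": r["entity"], "match_date": datetime.fromisoformat(r["match_date"])}
        for r in reader
    ]


# Hypothetical sample value, modelled on the timestamp from the question
stamp = datetime(2019, 8, 24, 1, 3, 44, 732999, tzinfo=timezone.utc)
csv_text = write_rows([{"entity": "example", "match_date": stamp}])
restored = read_rows(csv_text)
```

Because the column is written as an unambiguous string, reading the file back does not depend on any tool guessing the format. Printing csv_text (or opening the downloaded file in a plain text editor) shows exactly what was stored.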

solution1
0 ACCEPTED 2019-09-10 04:20:51
