
Issue reading a CSV file from AWS S3 with boto3

I have a CSV file with the following columns:

Name Adress/1 Adress/2 City State

When I read this CSV file from local disk, I have no issues.

But when I try to read it from S3 with the code below, I get an error when I use io.StringIO, and when I use io.BytesIO each record displays as a single column. Although the file is comma-separated, some columns contain '\n' or '\t' characters, and I believe these are causing the issue (see the sketch below).
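For context, pandas only keeps an embedded newline inside one record when the field is quoted; a minimal sketch with made-up sample data:

import io
import pandas as pd

# Quoted field: the embedded '\n' stays inside a single record.
quoted = 'Name,City\n"Acme\nCorp",Berlin\n'
print(pd.read_csv(io.StringIO(quoted)))

# Unquoted field: the same '\n' is treated as a row break,
# so the record splits across two broken rows.
unquoted = 'Name,City\nAcme\nCorp,Berlin\n'
print(pd.read_csv(io.StringIO(unquoted)))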

I was able to read it with AWS Data Wrangler (awswrangler) with no issues, but my requirement is to read this CSV file with boto3.
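For reference, the AWS Data Wrangler call that worked looked roughly like this (a sketch; the s3:// path built from AWS_S3_BUCKET and key is an assumption):

import awswrangler as wr

# Reads the object straight into a DataFrame; the path layout is assumed.
df = wr.s3.read_csv(path=f"s3://{AWS_S3_BUCKET}/{key}")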

import io

import boto3
import pandas as pd

# sep and cols are defined elsewhere in my script; shown here so the snippet runs on its own.
sep = ','
cols = ['Name', 'Adress/1', 'Adress/2', 'City', 'State']

s3 = boto3.resource('s3',
                    aws_access_key_id=AWS_ACCESS_KEY_ID,
                    aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
my_bucket = s3.Bucket(AWS_S3_BUCKET)
csv_obj = my_bucket.Object(key=key).get()['Body'].read().decode('utf16')
data = io.BytesIO(csv_obj)  # csv_obj is a str after .decode(), so BytesIO raises TypeError; io.StringIO(csv_obj) is the variant that errors in read_csv
sdf = pd.read_csv(data, delimiter=sep, names=cols, header=None, skiprows=1)
print(sdf)

Any suggestions?

Try get_object() instead:

import io
import boto3

obj = boto3.client('s3').get_object(Bucket=AWS_S3_BUCKET, Key=key)
data = io.StringIO(obj['Body'].read().decode('utf-8'))  # a text buffer that pd.read_csv accepts directly
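Note the decode('utf-8') here versus decode('utf16') in the question; use whichever matches the file's actual encoding. If the object really is UTF-16, you can also skip the manual decode and let pandas handle it (a sketch, assuming the same bucket and key variables):

import io

import boto3
import pandas as pd

# Hand pandas the raw bytes plus an explicit encoding instead of decoding manually.
obj = boto3.client('s3').get_object(Bucket=AWS_S3_BUCKET, Key=key)
df = pd.read_csv(io.BytesIO(obj['Body'].read()), encoding='utf-16')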
