
AWS Jupyter Notebook EC2 Instance: Getting error while reading pandas csv from S3

While reading a CSV from S3, the kernel restarts with the pop-up below:

Kernel Restarting
The kernel appears to have died. It will restart automatically

Below is the code snippet:

import boto3
import pandas as pd

YOUR_ACCESS_KEY = '******'
YOUR_SECRET_KEY = '******'
YOUR_BUCKET = '******'

# Download the CSV from S3 into the notebook's working directory
client = boto3.client('s3', aws_access_key_id=YOUR_ACCESS_KEY, aws_secret_access_key=YOUR_SECRET_KEY)
client.download_file(YOUR_BUCKET, 'test.csv', 'test.csv')

The error is thrown from the line below:

test_df = pd.read_csv('test.csv')

But I can access other files such as a sample text file:

client.download_file(YOUR_BUCKET, 'sample.txt', 'sample.txt')
print(open('sample.txt').read())

I assumed this error was caused by the size of the CSV file, but reading a 5 MB CSV file gives the same error.
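One way to test the size hypothesis is to read the downloaded file in chunks; if memory pressure were the cause, a chunked read should succeed where a single read_csv call kills the kernel. A minimal sketch (the chunksize value here is an arbitrary choice, not from the original code):

import pandas as pd

# Read the downloaded CSV in pieces to rule out memory pressure;
# chunksize is arbitrary and can be tuned.
chunks = []
for chunk in pd.read_csv('test.csv', chunksize=10000):
    chunks.append(chunk)

test_df = pd.concat(chunks, ignore_index=True)
print(test_df.shape)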

It appears to be a bug related to PyTorch.

https://github.com/jupyter/notebook/issues/2784

Alternatives and multiple workarounds are discussed there, but the ticket is still open.
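Independent of that ticket, one diagnostic worth trying (an assumption on my part, not something confirmed in the thread) is forcing pandas to use its pure-Python parser. If the kernel dies inside the default C parser or the native libraries it calls into, engine='python' is much slower but stays in Python and can narrow down where the crash happens:

import pandas as pd

print(pd.__version__)  # confirm which pandas build the kernel is running

# Force the pure-Python parser; slower, but avoids the C parser code path
test_df = pd.read_csv('test.csv', engine='python')
print(test_df.head())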

Hope it helps.
