
AWS Jupyter Notebook EC2 Instance: Getting error while reading pandas csv from S3

While reading a CSV from S3, the kernel restarts with the pop-up below:

Kernel Restarting
The kernel appears to have died. It will restart automatically

Below is the code snippet:

import boto3
import pandas as pd

YOUR_ACCESS_KEY = '******'
YOUR_SECRET_KEY = '******'
YOUR_BUCKET = '******'

# Download the CSV from S3 into the notebook's working directory
client = boto3.client('s3', aws_access_key_id=YOUR_ACCESS_KEY, aws_secret_access_key=YOUR_SECRET_KEY)
client.download_file(YOUR_BUCKET, 'test.csv', 'test.csv')

The error is thrown from the line below:

test_df = pd.read_csv('test.csv')

But I can access other files such as a sample text file:

client.download_file(YOUR_BUCKET, 'sample.txt', 'sample.txt')
print(open('sample.txt').read())

I assumed this error was caused by the size of the CSV file, but reading a 5 MB CSV file gives the same error.
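One way to test the size hypothesis is to read the downloaded file in chunks; if memory pressure were the cause, a chunked read should succeed where a single read_csv call kills the kernel. A minimal sketch (the chunksize value here is an arbitrary choice, not from the original code):

import pandas as pd

# Read the downloaded CSV in pieces to rule out memory pressure;
# chunksize is arbitrary and can be tuned.
chunks = []
for chunk in pd.read_csv('test.csv', chunksize=10000):
    chunks.append(chunk)

test_df = pd.concat(chunks, ignore_index=True)
print(test_df.shape)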

It appears to be a bug related to PyTorch.

https://github.com/jupyter/notebook/issues/2784

Alternatives and multiple workarounds are discussed there, but the ticket is still open.
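Independent of that ticket, one diagnostic worth trying (an assumption on my part, not something confirmed in the thread) is forcing pandas to use its pure-Python parser. If the kernel dies inside the default C parser or the native libraries it calls into, engine='python' is much slower but stays in Python and can narrow down where the crash happens:

import pandas as pd

print(pd.__version__)  # confirm which pandas build the kernel is running

# Force the pure-Python parser; slower, but avoids the C parser code path
test_df = pd.read_csv('test.csv', engine='python')
print(test_df.head())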

Hope it helps.
