
Jupyter notebook kernel dies when running Dask compute

I have a large CSV file (~25 GB, about 8,529,090 rows), and when I try to run the following, the kernel dies. I'm running on a MacBook Pro with 16 GB of RAM.

import dask.dataframe as dd

ddf = dd.read_csv('data/cleaned_news_data.csv')   # lazy; no data is read yet
ddf = ddf[(ddf.type != 'none')].compute()         # kernel dies here

Any ideas for a workaround?

Thanks for the help.

As you note in the comment above, calling compute() turns the result into an in-memory pandas object, so if the result doesn't fit in memory then you're out of luck.

Typically people compute smaller results (for example, the inputs to a plot), or they write very large results to disk.
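For illustration, here is a minimal sketch of both approaches, reusing the file path and the type column from your question; the aggregation choice and the output path are just examples, and writing Parquet assumes pyarrow or fastparquet is installed:

import dask.dataframe as dd

ddf = dd.read_csv('data/cleaned_news_data.csv')
filtered = ddf[ddf.type != 'none']  # still lazy; nothing is loaded yet

# Option 1: compute only a small result that fits comfortably in RAM,
# e.g. the number of remaining rows per type
counts = filtered.type.value_counts().compute()
print(counts)

# Option 2: write the full filtered dataset back to disk instead of
# pulling it into memory; Dask streams it out partition by partition
# (output path is an example; Parquet requires pyarrow or fastparquet)
filtered.to_parquet('data/cleaned_news_data_filtered.parquet')

If Parquet isn't an option, filtered.to_csv('data/cleaned_news_data_filtered-*.csv') writes one CSV file per partition, which likewise avoids materializing the whole result at once.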
