I am trying two different lines of code that both involve computing combinations of the rows of a DataFrame with 500k rows.
I think that, because of the large number of combinations, the kernel keeps dying. Is there any way to resolve this?
Both lines of code that crash are
pd.merge(df.assign(key=0), df.assign(key=0), on='key').drop('key', axis=1)
and
index_comb = list(combinations(df.index, 2))
Both are different ways to achieve the same desired DataFrame, but the kernel fails on both.
Would appreciate any help :/
Update: I tried running the code from my terminal and it was terminated with "Killed: 9" — so it is using too much memory in the terminal as well?
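For context on why both approaches run out of memory: 500,000 rows produce roughly 1.25 × 10^11 pairs, so materializing them all at once (whether as a merged DataFrame or as a list of index tuples) cannot fit in RAM. A minimal sketch of the arithmetic, plus a lazy alternative that iterates over pairs without building the list (the per-pair processing step is a placeholder):

```python
import math
from itertools import combinations, islice

n = 500_000
# Number of unordered pairs of rows -- 124,999,750,000, far too many to hold in memory
print(math.comb(n, 2))

# Instead of list(combinations(df.index, 2)), iterate lazily so that
# only one pair exists in memory at a time:
index = range(n)  # stands in for df.index here
for i, j in islice(combinations(index, 2), 5):
    # process one pair at a time; nothing is accumulated
    print(i, j)
```

`combinations` is a generator, so dropping the `list(...)` call removes the memory blow-up; the trade-off is that you must process pairs as a stream rather than index into them.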
There is no solution here that I know of. Jupyter Notebook simply is not designed to handle this quantity of data. Run your code from a terminal instead; that should work.
In case you run into the same problem when using a terminal, look here: Python Killed: 9 when running a code using dictionaries created from 2 csv files
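If the full cross join is genuinely needed, one way to keep memory bounded is to build it in chunks, so that only one slice of the product exists at a time. A sketch, assuming pandas ≥ 1.2 (for `how="cross"`); the chunk size is an illustrative placeholder you would tune to your available RAM:

```python
import pandas as pd

def cross_join_in_chunks(df, chunk_size=10_000):
    """Yield the cross join of df with itself one chunk at a time,
    so at most chunk_size * len(df) rows are in memory at once."""
    for start in range(0, len(df), chunk_size):
        left = df.iloc[start:start + chunk_size]
        yield left.merge(df, how="cross", suffixes=("_x", "_y"))

# Example on a small frame:
df = pd.DataFrame({"a": [1, 2, 3]})
total = sum(len(chunk) for chunk in cross_join_in_chunks(df, chunk_size=2))
print(total)  # 9 == 3 * 3
```

Each chunk can be processed (aggregated, filtered, written to disk) and then discarded; with 500k rows the full product is still ~2.5 × 10^11 rows, so chunking only helps if you reduce each chunk before moving on rather than concatenating them all.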
Edit: I ended up finding a way to potentially solve this: increasing your container size should prevent Jupyter from running out of memory. To do so, open the settings.cfg
file of Jupyter in the home directory of your notebook, $CHORUS_NOTEBOOK_HOME
The line to edit is this one:
#default memory per container
MEM_LIMIT_PER_CONTAINER="1g"
The default value is 1 GB per container; increasing this to 2 or 4 GB should help with memory-related crashes. However, I am unsure of any implications this has on performance, so be warned!