Pandas unable to allocate GiB for an array with shape X and data type Y

I am manipulating time series data in a dataframe (df1) that has a bunch of input columns, 300 period columns, and 839826 rows.

If I try to only manipulate the 839826 x 300 section of this dataframe by multiplying it by a similarly shaped section of a different dataframe (df2):

df1.iloc[:, 0:301] = df1.iloc[:, 0:301] * df2.iloc[:, 0:301]

I get this error:

Unable to allocate 1.88 GiB for an array with shape (301, 839826) and data type float64
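
As a quick sanity check, the size in the message follows directly from the shape and dtype: one float64 value takes 8 bytes, so an array of that shape needs rows x columns x 8 bytes. A minimal sketch of the arithmetic (the variable names are illustrative only):

```python
# Shape and dtype taken from the error message above.
rows, cols = 839826, 301
bytes_per_float64 = 8

total_bytes = rows * cols * bytes_per_float64
gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{gib:.2f} GiB")  # prints "1.88 GiB"
```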

I found an answer to a similar question, but the solution was for Linux and I am working on Windows. I have read online that I should use Dask, but I am not sure how to apply it here, or whether it is even the right solution.

The line

df1.iloc[:, 0:301] = df1.iloc[:, 0:301] * df2.iloc[:, 0:301]

first allocates a temporary array/DataFrame holding the result of the multiplication, and only then assigns it into df1. You may be able to avoid that temporary by using in-place operations only:

df1.iloc[:, 0:301] *= df2.iloc[:, 0:301]

This might get you over the immediate hurdle, but do look into Dask if you run into this kind of situation often.
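
If the in-place form still runs out of memory, another option that stays within pandas is to multiply a few columns at a time, so that each temporary is only as large as one chunk. The following is a minimal sketch, not a definitive implementation: `multiply_in_chunks` and `chunk_size` are hypothetical names, and the code assumes both frames have the same row order and numeric data in the selected columns, because it multiplies the underlying NumPy arrays positionally and skips the label alignment that `df1 * df2` would perform.

```python
import numpy as np
import pandas as pd


def multiply_in_chunks(left, right, n_cols, chunk_size=50):
    """Multiply the first n_cols columns of `left` by those of `right`.

    Hypothetical helper: assumes identical row order in both frames and
    numeric data in the first n_cols columns (no label alignment is done).
    """
    for start in range(0, n_cols, chunk_size):
        stop = min(start + chunk_size, n_cols)
        # Only this chunk's product exists as a temporary at any one time.
        product = (
            left.iloc[:, start:stop].to_numpy()
            * right.iloc[:, start:stop].to_numpy()
        )
        left.iloc[:, start:stop] = product
    return left


# Small stand-in frames; the real ones would have 839826 rows.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((1000, 8)))
df2 = pd.DataFrame(rng.random((1000, 8)))
expected = df1.to_numpy() * df2.to_numpy()

multiply_in_chunks(df1, df2, n_cols=8, chunk_size=3)
```

With the frames from the question, this would be called as `multiply_in_chunks(df1, df2, n_cols=301)`. At 50 columns per chunk, each product array is about 0.31 GiB instead of 1.88 GiB; a smaller `chunk_size` lowers that further at the cost of more loop iterations.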
