I am working with some big pandas DataFrames. I realised that the memory usage (as monitored in the Windows Task Manager) didn't decrease when assigning a subset of one DataFrame to itself. For example, if there is a big DataFrame df that takes roughly 10 GB of memory, after doing operations like:
df = df[df['v1']==1]
or even
df = df.loc[0:10]
The memory usage line in Task Manager wouldn't change at all.
I have searched for a while and read some posts here and there, but couldn't find an understandable reason or solution. Any help is appreciated!
Is there a way to reduce the memory usage? I read some posts suggesting reading less data in the first place, but that approach seems difficult in my case.
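For what it's worth, a pattern I've seen suggested elsewhere (sizes and column names below are made up for illustration) is to force an independent copy of the filtered subset and then explicitly trigger garbage collection, so the old DataFrame's memory can actually be reclaimed:

```python
import gc

import numpy as np
import pandas as pd

# Illustrative DataFrame; in the real case this would be the ~10 GB df.
np.random.seed(0)
df = pd.DataFrame({
    "v1": np.random.randint(0, 2, 1_000_000),
    "v2": np.random.rand(1_000_000),
})

# Plain df = df[df['v1'] == 1] may leave the result tied to the original
# object's memory; .copy() makes a fully independent, smaller DataFrame.
df = df[df["v1"] == 1].copy()

# Explicitly ask the garbage collector to reclaim the original's memory.
gc.collect()

# memory_usage reports the bytes actually held by the new DataFrame.
print(df.memory_usage(deep=True).sum())
```

I can't say whether this always shows up in Task Manager (the OS may not return freed pages to the system immediately), but `df.memory_usage(deep=True)` should confirm that the Python-side object itself shrank.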
One workaround that worked for me is deleting rows one by one in place:

for x in range(0, 10):
    df.drop(x, inplace=True, axis=0)
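A quick way to sanity-check that workaround (on a small, made-up DataFrame, since I can't reproduce the 10 GB case here) is to verify the rows are really gone after the in-place drops:

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame; column name "v1" is just a placeholder.
df = pd.DataFrame({"v1": np.arange(100)})

# Drop the first ten rows by index label, one at a time, in place.
for x in range(0, 10):
    df.drop(x, inplace=True, axis=0)

print(len(df))        # 90 rows remain
print(df.index.min()) # first surviving index label is 10
```

Dropping rows one at a time is slow for large frames; the same result comes from a single `df.drop(range(0, 10), axis=0, inplace=True)` call.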