Delete a Python pandas DataFrame at the end of a loop iteration

I am trying to apply the same treatment to a bunch of pandas dataframes.

As these dataframes are big, I don't have enough memory to load them all at the same time. So I have a list of their locations, and I want to load and analyze them one by one.

However, more and more memory is used with each iteration. I guess the dataframes are not deleted at the end of the iteration, and I don't know how to fix it.

Here is my code:

import glob
import os

import matplotlib.pyplot as plt
import pandas as pd

folder = 'my/folder'
colors = ['b', 'r']  # one color per chromosome

# Load and plot every tab-separated .txt file in the folder, one at a time
for i, f in enumerate(glob.glob(os.path.join(folder, '*.txt'))):
    print(f)
    df = pd.read_table(f, index_col=False, header=None, delimiter="\t", names=['chr', 'x', 'y'])
    plt.figure(figsize=(32, 8))
    for j, chrm in enumerate(df.chr.unique()):
        plt.plot(df.loc[df.chr == chrm].x, df.loc[df.chr == chrm].y, label=chrm, color=colors[j])
    plt.ylim(0, 200)
    plt.legend()

I must add that I work in Spyder.

So far, I have tried:

  • adding del df and df = None at the end of the loop
  • turning the for-loop into a function and calling map on it
  • calling gc.collect() from the gc module at the end of the loop
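
What these attempts can and cannot do may be easier to see on a small object than on a real dataframe. The sketch below is only an illustration under stated assumptions: Frame is a hypothetical stand-in class (not part of pandas), and the immediate freeing in the first case relies on CPython's reference counting.

```python
import gc
import weakref


class Frame:
    """Hypothetical stand-in for a large DataFrame (not a pandas class)."""

    def __init__(self, name):
        self.name = name


def is_alive(ref):
    """Return True while the object behind the weak reference still exists."""
    return ref() is not None


# Case 1: the name is the only reference, so `del` frees the object right away
# (CPython reference counting; other interpreters may free it later).
df = Frame("single reference")
probe = weakref.ref(df)
del df
freed_by_del = not is_alive(probe)

# Case 2: something else still holds the object (here a list), so neither
# `del` nor gc.collect() can free it until that reference goes away too.
df = Frame("also stored in a list")
keep = [df]
probe = weakref.ref(df)
del df
gc.collect()
held_after_collect = is_alive(probe)
keep.clear()
freed_after_release = not is_alive(probe)

# Case 3: the object is part of a reference cycle, so `del` alone is not
# enough and the cycle collector has to run.
gc.disable()  # keep the automatic collector out of the way for the demo
df = Frame("reference cycle")
df.self_ref = df
probe = weakref.ref(df)
del df
cycle_alive_after_del = is_alive(probe)
gc.collect()
cycle_freed_by_collect = not is_alive(probe)
gc.enable()

print("freed by del alone:", freed_by_del)
print("still alive while held elsewhere:", held_after_collect)
print("freed once released:", freed_after_release)
print("cycle alive after del:", cycle_alive_after_del)
print("cycle freed by gc.collect():", cycle_freed_by_collect)
```

If memory keeps growing even with del df and gc.collect() in the loop, the second case is the one to look into: something other than the name df may still be holding on to the data.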

Does anybody know how to delete my df at the end of each iteration, or an alternative solution?

Thanks a lot.

The del statement only removes the name; the object itself is freed only once nothing else references it. To reclaim data frames that are still waiting to be collected, trigger garbage collection manually. Try this:

import gc
gc.collect()
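
If collecting garbage does not bring memory back down, the figures may also be worth checking: pyplot keeps a reference to every figure it creates until that figure is closed, and the loop in the question opens one figure per file without closing any. Below is a minimal sketch of closing each figure once it is done. It rests on assumptions that are not in the question: the helper names plot_file and plot_folder are hypothetical, the figures are saved as PNG files instead of being shown in Spyder, the non-interactive Agg backend is selected, and the default color cycle replaces the colors list.

```python
import glob
import os

import matplotlib
matplotlib.use("Agg")  # assumption: render to files, not to a window
import matplotlib.pyplot as plt
import pandas as pd


def plot_file(path, out_dir):
    """Load one tab-separated file, plot it, save the plot, close the figure."""
    df = pd.read_csv(path, sep="\t", header=None, names=["chr", "x", "y"])
    fig, ax = plt.subplots(figsize=(32, 8))
    try:
        # One line per chromosome, using matplotlib's default color cycle
        for chrm, group in df.groupby("chr"):
            ax.plot(group["x"], group["y"], label=str(chrm))
        ax.set_ylim(0, 200)
        ax.legend()
        out_path = os.path.join(out_dir, os.path.basename(path) + ".png")
        fig.savefig(out_path)
    finally:
        # Without this, pyplot keeps the figure (and its plotted data) alive
        plt.close(fig)
    return out_path


def plot_folder(folder, out_dir):
    """Process the .txt files in `folder` one by one; return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = sorted(glob.glob(os.path.join(folder, "*.txt")))
    return [plot_file(p, out_dir) for p in paths]
```

Calling plot_folder('my/folder', 'plots') would then write one PNG per input file. Because df is local to plot_file, the dataframe goes out of scope as soon as each call returns, so no explicit del is needed in this version.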
