Why doesn't Python garbage collection work with threads?

Question

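When I define a thread as an instance variable, the other objects defined on that instance are not garbage collected. In this example, an array buffer is created on every iteration of the loop and is never garbage collected: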

import array
import gc
import threading

from pympler import muppy

class A():
  def __init__(self):
    self.buffer = array.array('B')
    # Defining a thread keeps array in memory
    self.thread = threading.Thread(target=lambda *_: None)

if __name__ == '__main__':
  for i in range(10):
    a = A()
    # del a  # needed
    gc.collect()
    print('Iteration {}:'.format(i))
    obj = muppy.get_objects()
    print('Array objects {}'.format(len(muppy.filter(obj, Type=array.ArrayType))))
    print('Thread objects {}'.format(len(muppy.filter(obj, Type=threading.Thread))))
    print('Running threads {}'.format(len(threading.enumerate())))

The output is:

Iteration 0:
Array objects 1
Thread objects 2
Running threads 1
Iteration 1:
Array objects 2
Thread objects 3
Running threads 1
...

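It does not matter whether the thread is started and joined. Explicitly deleting either the buffer object or self.thread allows garbage collection. I can't make sense of this behavior and would appreciate an explanation. In my production code, this eventually gets the Python process killed for running out of memory.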

Answer 1

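obj = muppy.get_objects() is a list of all objects in memory (and therefore holds a reference to every one of them).

Because the next iteration's garbage collection runs before the obj variable is overwritten, the results of muppy.get_objects() become cumulative. This is a trap I have fallen into myself when using muppy.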

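In short:

First gc.collect(): does not affect our variables
First obj = ...: muppy sees the thread held by the instance that a references
Second gc.collect(): the A instance created in the first round is no longer referenced by a but is still referenced by obj => it cannot be garbage collected
Second obj = ...: muppy sees the new thread referenced by a and the old thread referenced by obj (the old list is still alive while the new one is built, so the new list references the old one as well)
...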
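
The steps above can be reproduced without pympler. The following is a minimal sketch, not the code from the question: it assumes the standard library's gc.get_objects() as a stand-in for muppy.get_objects(), and since that call only reports gc-tracked objects, it counts instances of a hypothetical Payload class rather than arrays. The names Payload and count_live are made up for illustration.

```python
import gc


class Payload:
    """Hypothetical stand-in for class A from the question."""

    def __init__(self):
        self.buffer = bytearray(16)


def count_live(del_snapshot, iterations=5):
    """Return the number of live Payload instances seen on each iteration."""
    counts = []
    for _ in range(iterations):
        a = Payload()  # rebinding a drops our reference to the previous instance
        gc.collect()
        # The snapshot list references every tracked object, including the
        # previous snapshot list if that one has not been released yet.
        snapshot = gc.get_objects()
        counts.append(sum(1 for o in snapshot if type(o) is Payload))
        if del_snapshot:
            del snapshot  # release the list before the next iteration
    return counts


if __name__ == "__main__":
    print("keeping the snapshot: ", count_live(del_snapshot=False))
    print("deleting the snapshot:", count_live(del_snapshot=True))
```

On CPython, the first call should report a count that grows by one per iteration, while the second should stay at one, which mirrors the behavior described for muppy.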

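Rule of thumb when using muppy: always delete your reference to the object list before calling muppy.get_objects() again, or you may be in for a surprise.

Appending a simple del obj to the end of the loop results in: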

Iteration 0:
Array objects 1
Thread objects 2
Running threads 1
Iteration 1:
Array objects 1
Thread objects 2
Running threads 1
Iteration 2:
Array objects 1
Thread objects 2
Running threads 1
...
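
One way to follow the rule of thumb without remembering the del is to keep the object list inside a helper function, so the reference disappears when the function returns. This is only a sketch under assumptions: count_instances and run are hypothetical names, the standard library's gc.get_objects() stands in for muppy.get_objects(), and a bytearray replaces the array buffer.

```python
import gc
import threading


def count_instances(cls):
    """Count live, gc-tracked objects whose exact type is cls."""
    gc.collect()
    # snapshot is a local variable, so the list is released on return
    # and cannot keep objects alive across loop iterations.
    snapshot = gc.get_objects()
    return sum(1 for o in snapshot if type(o) is cls)


class A:
    def __init__(self):
        self.buffer = bytearray(16)
        # The thread is never started, as in the question.
        self.thread = threading.Thread(target=lambda *_: None)


def run(iterations=5):
    counts = []
    for _ in range(iterations):
        a = A()  # the previous instance loses its last reference here
        counts.append(count_instances(A))
    return counts


if __name__ == "__main__":
    print(run())
```

With the snapshot scoped this way, the count of A instances should stay at one per iteration even though each instance holds a thread object.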

P.S. This has nothing to do with threads.
