
How to create and delete class instance in python for loop

I am using a class to create a TensorFlow model. Within a for loop, I create an instance which I must delete at the end of each iteration in order to free up memory. The deletion does not work and I am running out of memory. Here is a minimal example of what I tried:

import numpy as np

class tfModel:
   def __init__(self, x):
      ...

   def predict(self, x):
      ...
      return x_new


if __name__ == "__main__":

   x = np.ones(100)
   for i in range(3):
      model = tfModel(x)
      x = model.predict(x)
      del model

I've read in related questions that `del` only deletes a reference, not the class instance itself. But how can I ensure that all references are deleted and the instance can be garbage collected?
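As a minimal illustration of that distinction: `del` only removes a name binding, and CPython frees the object once no references to it remain, which can be verified with a weak reference (the `Model` class here is just a stand-in):

```python
import gc
import weakref

class Model:
    """Stand-in for a heavyweight model object."""
    pass

m = Model()
r = weakref.ref(m)   # a weak reference does not keep the instance alive
del m                # removes the name binding "m"
gc.collect()         # CPython usually frees immediately via refcounting; collect() is belt-and-braces
print(r() is None)   # True: no other references remained, so the instance was freed
```

If `r()` still returned the object here, something else (a list, a closure, a retained TensorFlow graph) would still be holding a reference, and that holder is what would need to be cleared.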

I think you are talking about two things:

  1. The model itself. I assume your model can fit in your memory; otherwise you could not run any prediction at all.
  2. The data. If the data is the problem, you should write a Python data generator so that not all of the data lives in memory at the same time. Generate each example (x), or each batch of examples, and feed it into the model to get a prediction. If your memory cannot hold all the results, they can be serialized to disk when necessary.

More concretely, something like this:


class tfModel:
   def __init__(self):
      ...

   def predict(self, x):
      ...
      return x_new

def my_x_generator():
  for x in range(100):
    yield x


THRESHOLD = 16

if __name__ == "__main__":

   model = tfModel()
   my_result_buffer = []
   for x in my_x_generator():
      x_pred = model.predict(x)
      my_result_buffer.append(x_pred)
      if len(my_result_buffer) > THRESHOLD:
        ## serialize my_result_buffer to disk
        my_result_buffer = []
   ## remember to flush any results still left in my_result_buffer here
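One possible way to implement the serialization step, sketched with `pickle` and a hypothetical output path (the `flush_buffer` helper and file name are assumptions for illustration); each flush appends one chunk of buffered predictions to the file:

```python
import pickle

def flush_buffer(buffer, path):
    # Append the buffered predictions as a single pickle record;
    # the caller then resets the buffer to [].
    with open(path, "ab") as f:
        pickle.dump(buffer, f)

# Hypothetical usage inside the loop above:
# if len(my_result_buffer) > THRESHOLD:
#     flush_buffer(my_result_buffer, "predictions.pkl")
#     my_result_buffer = []
```

Reading the file back is a matter of calling `pickle.load` repeatedly until `EOFError`, one chunk per flush.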

Also note that in my sample code above:

  1. The constructor of tfModel should not depend on x (x is removed from __init__). Of course, you could still use model parameters to initialize your model.
  2. You should instantiate your model outside of the data loop. The model only needs to be instantiated once; the same model is used to make predictions for all examples.

It seems to be a TensorFlow-specific problem. Using the multiprocessing module, one can spawn a process within each iteration of the for loop. Each process releases its memory when it finishes.

I found this solution here: Clearing Tensorflow GPU memory after model execution
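A sketch of that approach, with a placeholder computation standing in for the real model (the `run_model` helper and the `x + 1` body are assumptions for illustration): the model is built inside a child process, so all of its memory, including any GPU allocations, is returned to the OS when that process exits.

```python
import multiprocessing as mp
import numpy as np

def run_model(x, queue):
    # In the real code, build the model here so that all of its
    # memory lives (and dies) with this child process:
    #   model = tfModel()
    #   x_new = model.predict(x)
    x_new = x + 1  # placeholder for the real prediction
    queue.put(x_new)

if __name__ == "__main__":
    x = np.ones(100)
    for i in range(3):
        queue = mp.Queue()
        p = mp.Process(target=run_model, args=(x, queue))
        p.start()
        x = queue.get()  # fetch the result before joining
        p.join()         # the process exits and its memory is freed
```

Note that `queue.get()` is called before `p.join()`; joining first can deadlock if the result is large, because the child blocks until the queue is drained.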
