
How to create and delete class instance in python for loop

I am using a class to create a TensorFlow model. Within a for loop, I create an instance which I must delete at the end of each iteration in order to free up memory. The deletion does not work and I am running out of memory. Here is a minimal example of what I tried:

import numpy as np

class tfModel:
   def __init__(self, x):
      ...

   def predict(self, x):
      ...
      return x_new


if __name__ == "__main__":

   x = np.ones(100)
   for i in range(3):
      model = tfModel(x)
      x = model.predict(x)
      del model

I've read in related questions that `del` only deletes a reference, not the class instance itself. But how can I ensure that all references are deleted and the instance can be garbage collected?
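As a minimal illustration of that distinction: `del` only removes a name binding, and CPython frees the object once no references to it remain, which can be verified with a weak reference (the `Model` class here is just a stand-in):

```python
import gc
import weakref

class Model:
    """Stand-in for a heavyweight model object."""
    pass

m = Model()
r = weakref.ref(m)   # a weak reference does not keep the instance alive
del m                # removes the name binding "m"
gc.collect()         # CPython usually frees immediately via refcounting; collect() is belt-and-braces
print(r() is None)   # True: no other references remained, so the instance was freed
```

If `r()` still returned the object here, something else (a list, a closure, a retained TensorFlow graph) would still be holding a reference, and that holder is what would need to be cleared.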

I think you are talking about two things:

  1. The model itself. I assume your model can fit in your memory; otherwise you could not run any prediction at all.
  2. The data. If the data is the problem, you should write a Python data generator so that not all of the data lives in memory at the same time. Generate each example (x), or each batch of examples, and feed it into the model to get a prediction. If your memory cannot hold all the results, they can be serialized to disk when necessary.

More concretely, something like this:


class tfModel:
   def __init__(self):
      ...

   def predict(self, x):
      ...
      return x_new

def my_x_generator():
  for x in range(100):
    yield x


THRESHOLD = 16

if __name__ == "__main__":

   model = tfModel()
   my_result_buffer = []
   for x in my_x_generator():
      x_pred = model.predict(x)
      my_result_buffer.append(x_pred)
      if len(my_result_buffer) > THRESHOLD:
        ## serialize my_result_buffer to disk
        my_result_buffer = []
   ## remember to flush any results still left in my_result_buffer here
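One possible way to implement the serialization step, sketched with `pickle` and a hypothetical output path (the `flush_buffer` helper and file name are assumptions for illustration); each flush appends one chunk of buffered predictions to the file:

```python
import pickle

def flush_buffer(buffer, path):
    # Append the buffered predictions as a single pickle record;
    # the caller then resets the buffer to [].
    with open(path, "ab") as f:
        pickle.dump(buffer, f)

# Hypothetical usage inside the loop above:
# if len(my_result_buffer) > THRESHOLD:
#     flush_buffer(my_result_buffer, "predictions.pkl")
#     my_result_buffer = []
```

Reading the file back is a matter of calling `pickle.load` repeatedly until `EOFError`, one chunk per flush.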

Also note that in my sample code above:

  1. The constructor of tfModel should not depend on x (x is removed from __init__). Of course, you could still use model parameters to initialize your model.
  2. You should instantiate your model outside of the data loop. The model only needs to be instantiated once; the same model is used to make predictions for all examples.

It seems to be a TensorFlow-specific problem. Using the multiprocessing module, one can spawn a process within each iteration of the for loop. Each process releases its memory when it finishes.

I found this solution here: Clearing Tensorflow GPU memory after model execution
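A sketch of that approach, with a placeholder computation standing in for the real model (the `run_model` helper and the `x + 1` body are assumptions for illustration): the model is built inside a child process, so all of its memory, including any GPU allocations, is returned to the OS when that process exits.

```python
import multiprocessing as mp
import numpy as np

def run_model(x, queue):
    # In the real code, build the model here so that all of its
    # memory lives (and dies) with this child process:
    #   model = tfModel()
    #   x_new = model.predict(x)
    x_new = x + 1  # placeholder for the real prediction
    queue.put(x_new)

if __name__ == "__main__":
    x = np.ones(100)
    for i in range(3):
        queue = mp.Queue()
        p = mp.Process(target=run_model, args=(x, queue))
        p.start()
        x = queue.get()  # fetch the result before joining
        p.join()         # the process exits and its memory is freed
```

Note that `queue.get()` is called before `p.join()`; joining first can deadlock if the result is large, because the child blocks until the queue is drained.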
