
Memory leak happened in Python Google App Engine project. Any efficient way to write my operation?

I have a GAE project written in Python. I made a cron job to execute a batch operation, but it hit the soft private memory limit of an F1 instance, which is 124MB, after a few iterations. Could anyone help me write this code more efficiently, hopefully staying within 124MB? len(people) should be less than 500.

def cron():
    q = Account.all().filter('role =', 1)
    people = [e for e in q]  # materializes every matching entity in memory at once
    for p in people:
        s = Schedule.available(p)
        m = ScheduleMapper(s).as_dict()
        memcache.set('key_for_%s' % p.key(), m)
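One low-memory pattern is to avoid materializing the whole result list and to write the cached dicts in small batches (in GAE, iterating the query lazily and using `memcache.set_multi` per batch instead of one `memcache.set` per entity). The batching itself can be sketched as a plain-Python helper; the commented usage below is an assumption about how it would plug into the question's code, not tested GAE code:

```python
def iter_chunks(iterable, size):
    """Yield successive lists of at most `size` items without
    ever holding the full input in memory."""
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

# Hypothetical usage inside cron() -- q.run() iterates lazily and each
# chunk becomes a single memcache.set_multi() call:
#
# for batch in iter_chunks(q.run(batch_size=50), 50):
#     memcache.set_multi(
#         {'key_for_%s' % p.key():
#              ScheduleMapper(Schedule.available(p)).as_dict()
#          for p in batch})
```

This keeps at most one chunk of entities (plus their mapped dicts) alive at a time, instead of all ~500.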

This is the dev server and I don't want to upgrade my instance class. Also, I want to avoid using third-party libraries such as numpy and pandas.

I added garbage collection at the end of the for loop, but it doesn't seem to work:

import gc

del s
m.clear()
gc.collect()
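That cleanup cannot help much here: `del s` removes only one name binding, and `gc.collect()` reclaims only unreachable objects, so every entity still referenced by the `people` list stays in memory. A minimal illustration of that behaviour (the `Entity` class is just a stand-in for a datastore entity):

```python
import gc

class Entity(object):
    """Stand-in for a datastore entity."""
    pass

e = Entity()
holder = [e]   # plays the role of the `people` list keeping entities alive
del e          # removes the local name only; the list still references the object
gc.collect()   # frees unreachable cycles, not objects that are still reachable

print(isinstance(holder[0], Entity))  # -> True: the entity is still alive
```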

To see whether it is even possible to fit this into the memory footprint you want, modify your query to fetch a single entity and check whether you can execute the for loop successfully for that one entity. Or just add a break at the end of the for loop :)

If that doesn't work, you need to upgrade your instance class.

If the experiment works, then you can split the work using Query Cursors into multiple push queue tasks, each processing only one entity or just a few of them.
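The cursor-based splitting can be sketched in plain Python. In real GAE code the page would come from the query's cursor API (e.g. `with_cursor()`/`cursor()` on a `db` query) and the chaining from `taskqueue.add`; `fetch_page` and `run_task` below are hypothetical stand-ins that only model the control flow:

```python
def fetch_page(items, cursor, page_size):
    """Stand-in for a cursor query: return one page and the next cursor."""
    page = items[cursor:cursor + page_size]
    return page, cursor + len(page)

def run_task(items, cursor, page_size, processed):
    """One push-queue task: process a small page, then chain the next task."""
    page, next_cursor = fetch_page(items, cursor, page_size)
    processed.extend(x * 2 for x in page)  # placeholder for the memcache work
    if next_cursor < len(items):
        # In GAE this would be taskqueue.add(..., params={'cursor': next_cursor})
        run_task(items, next_cursor, page_size, processed)
```

Each "task" only ever holds one page of entities, so the per-instance memory footprint stays small regardless of the total result size.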

Maybe take a look at Google appengine: Task queue performance for a discussion about splitting the work into multiple tasks (though the reason for splitting in that case was exceeding the request deadline, not the memory limit).

Note that even when using multiple tasks it is still possible to hit the memory limit (see App Engine Deferred: Tracking Down Memory Leaks), but at least the work would get done even if a particular instance is restarted (tasks are retried by default).
