
How to track memory for a python script

We have a system that has only one interpreter. Many user scripts come through this interpreter. We want to put a cap on each script's memory usage. There is only one process, and that process invokes a tasklet for each script. Since we only have one interpreter and one process, we don't know of a way to put a cap on each script's memory usage. What is the best way to do this?

I don't think that it's possible at all. Your question implies that the memory used by your tasklets is completely separated, which is probably not the case. Python optimizes small objects like integers: as far as I know, every 3 in your code uses the same object, which is not a problem because it is immutable. So if two of your tasklets use the same (small?) integer, they are already sharing memory. ;-)
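You can see this caching behavior directly (it is a CPython implementation detail, not a language guarantee):

```python
# CPython interns small integers (roughly -5..256), so "separate"
# values can point at the same object and thus share memory.
a = 3
b = 3
print(a is b)          # True: both names refer to the same cached int

# Larger ints built at runtime are distinct objects; int("...") is used
# here to avoid the compiler folding both into one shared constant.
big_a = int("1000000")
big_b = int("1000000")
print(big_a is big_b)  # False in CPython: two separate objects
```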

Memory is separated at the OS process level. There's no easy way to tell which tasklet, or even which thread, a particular object belongs to.

Also, there's no easy way to add a custom bookkeeping allocator that would analyze which tasklet or thread is allocating a piece of memory and prevent it from allocating too much. It would also need to plug into the garbage-collection code to discount objects that are freed.
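The closest built-in approximation is probably the standard library's `tracemalloc`, which can attribute allocations to the source file that made them, but it measures the whole process and cannot enforce a per-tasklet cap; a minimal sketch:

```python
import tracemalloc

tracemalloc.start()

# Stand-in for the allocations a user script would make.
data = [object() for _ in range(100_000)]

snapshot = tracemalloc.take_snapshot()
# Group allocations by the file that made them -- rough, process-wide
# attribution, with no way to block a script that allocates too much.
for stat in snapshot.statistics("filename")[:5]:
    print(stat)
```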

Unless you're keen to write a custom Python interpreter, using a process per task is your best bet.

You don't even need to kill and respawn the interpreters every time you need to run another script. Pool several interpreters and only kill the ones that grow beyond a certain memory threshold after running a script. Limit the interpreters' memory consumption by means provided by the OS if you need to.
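One way to apply such an OS-level cap is `resource.setrlimit`, a sketch assuming a Unix system (`RLIMIT_AS` and `preexec_fn` are not available on Windows); `user_script.py` is a hypothetical placeholder:

```python
import resource
import subprocess
import sys

MEM_LIMIT = 256 * 1024 * 1024  # example: 256 MiB address-space cap

def limit_memory():
    # Runs in the child between fork() and exec(); Unix only.
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT, MEM_LIMIT))

# One interpreter process per user script. If the script exceeds the
# cap, its allocations fail with MemoryError and the worker can be
# killed and respawned by the pool.
proc = subprocess.Popen(
    [sys.executable, "user_script.py"],
    preexec_fn=limit_memory,
)
proc.wait()
```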

If you need to share large amounts of common data between the tasks, use shared memory; for smaller interactions, use sockets (with a messaging layer above them as needed).
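For the shared-memory part, the standard library's `multiprocessing.shared_memory` (Python 3.8+) is one option; a minimal sketch:

```python
from multiprocessing import shared_memory

# Parent: create a named shared block and write common data into it.
shm = shared_memory.SharedMemory(create=True, size=1024)
shm.buf[:5] = b"hello"

# A worker process can attach to the same block by name:
#   worker = shared_memory.SharedMemory(name=shm.name)
#   print(bytes(worker.buf[:5]))   # b"hello"
#   worker.close()

shm.close()
shm.unlink()  # free the block once all workers are done
```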

Yes, this might be slower than your current setup. But from your use of Python I suppose that these scripts don't do any time-critical computing anyway.
