
64-bit Python fills up memory until the computer freezes, with no MemoryError

I used to run 32-bit Python on a 32-bit OS, and whenever I accidentally appended values to an array in an infinite loop or tried to load too large a file, Python would simply stop with an out-of-memory error. Now that I use 64-bit Python on a 64-bit OS, Python no longer raises an exception: it uses up every last bit of memory and freezes my computer, so I am forced to restart it.

I looked around Stack Overflow, and there doesn't seem to be a good way to control or limit memory usage. For example, the solution in "How to set memory limit for thread or process in python?" limits the resources Python can use, but it would be impractical to paste it into every piece of code I write.
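
One common way to impose such a limit on Unix-like systems is the standard-library resource module. The sketch below is only an illustration under that assumption: the resource module is not available on Windows, and the helper name limit_address_space is hypothetical. With the cap in place, an allocation that would exceed it is expected to fail with MemoryError instead of exhausting the machine.

```python
import resource


def limit_address_space(max_bytes):
    """Cap this process's virtual address space (Unix only).

    Hypothetical helper for illustration. RLIMIT_AS limits virtual
    address space, not resident memory, so the cap is approximate.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed the hard limit, so clamp the request.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    # Only the soft limit is lowered; the hard limit is left unchanged.
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
```

Calling limit_address_space(2 * 1024**3) near the start of a script would cap it at roughly 2 GiB of address space.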

How can I prevent this from happening?

I don't know whether this will be a solution for anyone but me, as my case was very specific, but I thought I'd post it here in case someone can use my procedure.

I had a very large dataset with millions of rows. Querying this data from a PostgreSQL database used up a lot of my available memory (63.9 GB in total on a Windows 10 64-bit PC running 64-bit Python 3.x): each query used around 28-40 GB, because the rows had to be kept in memory while Python did calculations on them. I used the psycopg2 module to connect to PostgreSQL.

My initial procedure was to perform the calculations and append each result to a list, which my methods would then return. I quite quickly ended up with too much stored in memory, and my PC started freaking out (it froze, logged me out of Windows, the display driver stopped responding, etc.).

Therefore I changed my approach to use Python generators. Since I wanted to store the results of my calculations back in the database, I wrote each row to the database as soon as I had finished calculating on it.

def fetch_rows(cursor, arraysize=1000):
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result

With this approach I do the calculations on each yielded row by using my generator:

import psycopg2

def main():
    connection_string = "...."
    connection = psycopg2.connect(connection_string)
    cursor = connection.cursor()
    # Run the query before fetching; the query text is a placeholder here
    cursor.execute("....")

    # Using generator
    for row in fetch_rows(cursor):
        # placeholder functions
        result = do_calculations(row)
        write_to_db(result)

Note, however, that this procedure still requires enough physical RAM to hold the data in memory.
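
For anyone who wants to try the pattern without a PostgreSQL server, here is a minimal sketch that uses sqlite3 from the standard library as a stand-in for psycopg2. The table names source and target, the helper square_values and the small batch size are made up for illustration; the point is only that each result is written as soon as it is computed rather than collected in a list.

```python
import sqlite3


def fetch_rows(cursor, arraysize=1000):
    """Yield rows one at a time, pulling them from the cursor in batches."""
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result


def square_values(connection, batch_size=2):
    """Read each value from `source` and write its square to `target`."""
    read_cursor = connection.cursor()
    write_cursor = connection.cursor()
    read_cursor.execute("SELECT value FROM source ORDER BY value")
    for (value,) in fetch_rows(read_cursor, batch_size):
        # Write each result immediately instead of appending it to a list.
        write_cursor.execute(
            "INSERT INTO target (value) VALUES (?)", (value * value,)
        )
    connection.commit()


# Hypothetical demo data in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE source (value INTEGER)")
connection.execute("CREATE TABLE target (value INTEGER)")
connection.executemany(
    "INSERT INTO source (value) VALUES (?)", [(n,) for n in range(5)]
)

square_values(connection)
squares = [
    row[0]
    for row in connection.execute("SELECT value FROM target ORDER BY value")
]
print(squares)
```

Only one batch of fetched rows and one result are referenced by the Python code at any time, which is what keeps the list from growing without bound.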

I hope this helps whoever is out there with the same problem.

Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address.

 