简体   繁体   English

Python - function RAM memory 效率

[英]Python - function RAM memory efficiency

I have this function:我有这个 function:

def preprocess(data_input):
    data = data_input.copy()
    
    data = # some code
    data = # some code
    data = # some code
    
    return(data)

I have a dataframe df that comes as an input to this function after a long quantity of jupyter notebook cells, like:我有一个 dataframe df作为这个 function 在大量 jupyter 笔记本单元之后的输入,例如:

preprocess(df)

With the current function, I assign the changes of the local df (inside the function) to the global df (outside the function) with df = preprocess(df) .使用当前的 function,我使用df = preprocess(df)将本地df (函数内部)的更改分配给全局df (函数外部)。 I do this thing in this way in order to avoid running the whole notebook each time I find an error in the function.我这样做是为了避免每次在 function 中发现错误时运行整个笔记本。 So when I know the function is running correctly (after infinite validations and bugs corrections), I just change preprocess(df) for df = preprocess(df) .因此,当我知道 function 运行正常(在无限验证和错误更正之后)时,我只需将preprocess(df)更改为df = preprocess(df)

This answer says that the best case is to use the .copy() method (The answer is a bit old, from 2015).这个答案说最好的情况是使用.copy()方法(答案有点旧,从 2015 年开始)。 However, I'm wondering about memory efficiency.但是,我想知道 memory 的效率。 If I have a 4GB data frame, only ONE copy would make me to use 8GB RAM (the original + the copy).如果我有一个 4GB 的数据框,那么只有一份副本会让我使用 8GB RAM(原件 + 副本)。 So, is there any way of replace the .copy() method with a more efficient alternative?那么,有没有办法用更有效的替代方法替换.copy()方法?

In addition, You are welcome if you have a better suggestion for avoiding running the whole notebook when debugging a function, I would appreciate it!另外,如果您在调试function时有更好的避免运行整个笔记本的建议,我将不胜感激!

EDIT编辑

I forgot to mention if I don't use .copy() method, I always receive a SettingWithCopyWarning .我忘了提到如果我不使用.copy()方法,我总是会收到SettingWithCopyWarning

If I understand correctly, You can probably put that function inside another module so you can test it without having to run your whole code to read that function definition.如果我理解正确,您可以将 function 放在另一个模块中,这样您就可以对其进行测试,而无需运行整个代码来阅读 function 定义。

Having your function inside a module would allow you to make a small data set to design a test case that lets you test the most common use cases, you can even create a whole new script to test your function.将 function 放在一个模块中可以让您创建一个小数据集来设计一个测试用例,让您测试最常见的用例,您甚至可以创建一个全新的脚本来测试您的 function。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM