
Expensive operation done once in a function that is called many times, Python 3

I have a long list of groups in JSON and I want a little utility:

def verify_group(group_id):
    group_ids = set()
    for grp in groups:
        group_ids.add(grp.get("pk"))
    return group_id in group_ids

The obvious approach is to load the set outside the function, or otherwise declare a global, but let's assume I don't want a global variable.

In statically typed languages I can declare the set static, and I believe that would accomplish my aim. How would one do something similar in Python? That is: the first call initializes the set group_ids, and subsequent calls use the set initialized in the first call.

BTW, when I use the profilestats package to profile this little code snippet, I see these frightening results:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      833    0.613    0.001    1.059    0.001 verify_users_groups.py:25(verify_group)
  2558976    0.253    0.000    0.253    0.000 {method 'get' of 'dict' objects}
  2558976    0.193    0.000    0.193    0.000 {method 'add' of 'set' objects}

I tried adding functools.lru_cache, but that type of optimization doesn't address my present question: how can I load the set group_ids once, inside a def block?

Thank you for your time.

There isn't an equivalent of static; however, you can achieve the same effect in different ways.

One way is to abuse the infamous mutable default argument:

def verify_group(group_id, group_ids=set()):
    if not group_ids:
        group_ids.update(grp.get("pk") for grp in groups)
    return group_id in group_ids

This, however, allows the caller to change that value (which may be a feature or a bug for you).

I usually prefer using a closure:
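To make both points concrete, here is a small sketch using a hypothetical `groups` list (the sample data is an assumption, not taken from the question). It shows that the default set is created once, when the `def` statement runs, and that a caller can still pass in a set of their own:

```python
# Hypothetical sample data standing in for the JSON-loaded list.
groups = [{"pk": 1}, {"pk": 2}, {"pk": 3}]

def verify_group(group_id, group_ids=set()):
    # The default set is created once, when the def statement is executed.
    if not group_ids:
        group_ids.update(grp.get("pk") for grp in groups)
    return group_id in group_ids

first = verify_group(2)                       # fills the shared default set
cached = verify_group.__defaults__[0]         # the same set object on every call
overridden = verify_group(2, group_ids={99})  # the caller supplies its own set
```

Note that an empty `groups` list would leave the set empty, so the `if not group_ids` check would try to refill it on every call.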

def make_group_verifier():
    group_ids = {grp.get("pk") for grp in groups}
    def verify_group(group_id):
        # nonlocal group_ids # if you need to change its value
        return group_id in group_ids
    return verify_group

verify_group = make_group_verifier()
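If the set should be built on the first call rather than when the verifier is created, the closure can initialize it lazily with nonlocal. The following is only a sketch; the `groups` sample data is hypothetical:

```python
# Hypothetical sample data standing in for the JSON-loaded list.
groups = [{"pk": 1}, {"pk": 2}, {"pk": 3}]

def make_group_verifier():
    group_ids = None  # not built yet

    def verify_group(group_id):
        nonlocal group_ids
        if group_ids is None:
            # Runs only on the first call; later calls reuse the set.
            group_ids = {grp.get("pk") for grp in groups}
        return group_id in group_ids

    return verify_group

verify_group = make_group_verifier()
results = [verify_group(1), verify_group(5)]

# The set was captured on the first call, so later changes to
# groups are not picked up.
groups.append({"pk": 5})
stale = verify_group(5)
```

The last two lines show the usual trade-off of caching: the set is not rebuilt if `groups` changes afterwards.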

Python is an OOP language. What you describe is an instance method. Initialize the class with the set and call the method on the instance.

class GroupVerifier:
    def __init__(self):
        self.group_ids = {grp.get("pk") for grp in groups}
    def verify(self, group_id):
        # could be __call__
        return group_id in self.group_ids
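As a usage sketch of the `__call__` variant mentioned in the comment: the instance can then be called like a plain function. Passing `groups` to the constructor is an assumption made here to keep the example self-contained, and the sample data is hypothetical:

```python
# Hypothetical sample data standing in for the JSON-loaded list.
groups = [{"pk": 1}, {"pk": 2}, {"pk": 3}]

class GroupVerifier:
    def __init__(self, groups):
        # Built once, when the instance is created.
        self.group_ids = {grp.get("pk") for grp in groups}

    def __call__(self, group_id):
        return group_id in self.group_ids

verify_group = GroupVerifier(groups)
results = [verify_group(gid) for gid in (1, 2, 7)]
```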

I'd also like to add that it depends on the API design. You could let the caller take responsibility for pre-computing and providing the value if they want performance. This is the choice taken by, for example, random.choices. The cum_weights parameter isn't necessary, but it allows the user to remove the cost of computing that array on every call in performance-critical code. So instead of having a mutable default argument, you use None as the default and compute the set only if the value passed is None; otherwise you assume the caller did the work for you.
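A minimal sketch of that None-default design, assuming a hypothetical `groups` list as sample data:

```python
# Hypothetical sample data standing in for the JSON-loaded list.
groups = [{"pk": 1}, {"pk": 2}, {"pk": 3}]

def verify_group(group_id, group_ids=None):
    if group_ids is None:
        # Slow path: the caller did not precompute, so build the set now.
        group_ids = {grp.get("pk") for grp in groups}
    return group_id in group_ids

# Performance-critical callers build the set once and pass it in.
precomputed = {grp.get("pk") for grp in groups}
hits = [verify_group(gid, precomputed) for gid in (1, 4)]

# Casual callers can omit it and pay the cost on each call.
slow = verify_group(3)
```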

