
Is it better to use Python's lru_cache / memoize feature on public functions or lower-level private functions, or does it matter?

In Python 3, given public & private functions, something like this:

def my_public_function(a, b, c) -> int:
    rv = _my_private_function(a, b, c)
    return rv

def _my_private_function(a, b, c) -> int:
    return a + b + c

If I want to optimize this using functools.lru_cache, am I better off doing that on the public function or the private one?

My instinct is the public function so that it's cached as early as possible and returned without touching anything further.

from functools import lru_cache

@lru_cache()
def my_public_function...

Is there any design pattern or other design principle which would suggest using @lru_cache() on _my_private_function instead?

You save time on every cache hit (a function call and two additions) by handling this in the public function. Python doesn't really have a notion of "private" functions. Starting a name with _ is a convention that tells others to stay away, but the language does nothing to enforce it. So the underscore has no effect on any caller or decorator, including lru_cache.
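As a rough illustration of that point, here is a minimal sketch that puts the cache on the public function from the question and counts how often the helper body actually runs. The helper_calls counter is added purely for demonstration and is not part of the original code.

```python
from functools import lru_cache

helper_calls = 0  # demonstration-only counter, not in the original question


def _my_private_function(a, b, c) -> int:
    global helper_calls
    helper_calls += 1  # record that the helper body really executed
    return a + b + c


@lru_cache(maxsize=None)
def my_public_function(a, b, c) -> int:
    return _my_private_function(a, b, c)


first = my_public_function(1, 2, 3)   # cache miss: the helper runs
second = my_public_function(1, 2, 3)  # cache hit: the helper is skipped
info = my_public_function.cache_info()
print(first, second, helper_calls, info.hits, info.misses)
```

The decorator could be moved to _my_private_function without any other change. The leading underscore makes no difference to how lru_cache wraps the function.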

There is no general-purpose answer to this, because there basically aren't any cases where the choice of where to put the cache has no effect. If there were such a case, put it on the public function to skip the extra function call and whatever other setup work the public function has to do. But that's just never going to come up. Instead, it's going to be something like:

  • The helper function is trivial, as in your example, and the cost of the extra function call is on the same scale as the other things you're trying to optimize away. Then you almost certainly want to inline the helper into the public function, and then there's only one place to memoize.
  • The helper function is trivial, but it's called by a bunch of different public functions. Giving each public function its own cache that duplicates the work done by all the other caches is going to waste far more resources, and lead to a lot more cache misses, than optimizing out the wrapper stuff will save.
  • The helper is nontrivial, leading to an actual semantic difference between caching the helper and caching the top-level function. So you have to decide based on which one is correct, not based on a performance issue.
  • The helper is nontrivial, and it doesn't really matter semantically when you cache things—but you regularly call the helper with the same values even for different inputs to the public function. So the gain from memoizing the helper will vastly swamp the microoptimization of memoizing before the extra function call instead of after it.

And so on.
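To make the shared-helper cases above concrete, here is a small sketch in which two public functions call one memoized helper. All of the names (_shared_helper, public_total, public_scaled) are hypothetical, and the sum of squares only stands in for real work.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def _shared_helper(n: int) -> int:
    # Stand-in for nontrivial work; hypothetical, not from the question.
    return sum(i * i for i in range(n))


def public_total(n: int, offset: int) -> int:
    return _shared_helper(n) + offset


def public_scaled(n: int, factor: int) -> int:
    return _shared_helper(n) * factor


# Three public calls with different arguments, but the same helper input.
public_total(1000, 1)
public_total(1000, 2)
public_scaled(1000, 3)

helper_info = _shared_helper.cache_info()
print(helper_info.hits, helper_info.misses)
```

With the cache on the helper, the expensive part runs once and the other two calls are hits. If each public function had its own cache instead, all three calls would miss, because each has a different argument tuple.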
