
Best practice for parametrizing multiple subfunctions

I often run into a situation where I have a top-level function from which I want to be able to modify any of the parameters of multiple subfunctions. The following example illustrates this:

def plot_data_processing(data_param_1=3, data_param_N=4,
        processing_param_1='a', processing_param_2='b', plotting_param_1='c',
        plotting_param_2=1324):
    data = get_data(data_param_1=data_param_1, data_param_N=data_param_N)
    processed_data = process_data(data, processing_param_1=processing_param_1, processing_param_2=processing_param_2)
    plot_data(processed_data, plotting_param_1=plotting_param_1, plotting_param_2=plotting_param_2)

Now, this is sort of ugly, because I'm forced to redefine all the defaults for my inner functions, and my parameters are one big mess. I suppose I could do the following:

def plot_data_processing(data_kwargs, processing_kwargs, plotting_kwargs):
    data = get_data(**data_kwargs)
    processed_data = process_data(data, **processing_kwargs)
    plot_data(processed_data, **plotting_kwargs)

plot_data_processing(dict(data_param_1=3, data_param_N=4), dict(processing_param_1='a', processing_param_2='b'), dict(plotting_param_1='c',plotting_param_2=1324))

Still, this is not great, because I'm passing arguments via a dict, which means they are not validated until the inner function is actually called. That seems like a recipe for bugs and unreadable code. Also, I have no freedom to swap the functions called internally for different functions with a similar interface. So I could also go:

def plot_data_processing(data_getter, data_processor, plotter):
    data = data_getter()
    processed_data = data_processor(data)
    plotter(processed_data)

class DataGetter(object):
    def __init__(self, data_param_1=3, data_param_N=4):
        self.data_param_1 = data_param_1
        self.data_param_N = data_param_N
    def __call__(self):
        # ....
        return data

# ... Also define classes DataProcessor and Plotter

plot_data_processing(DataGetter(data_param_1=3, data_param_N=4), DataProcessor(processing_param_1='a', processing_param_2='b'), Plotter(plotting_param_1='c',plotting_param_2=1324))

However, this also seems to involve unnecessary structure and fluff code (self.x = x and all that). I can get around that by using partials (or lambdas):

def plot_data_processing(data_getter, data_processor, plotter):
    data = data_getter()
    processed_data = data_processor(data)
    plotter(processed_data)

# Called like:
from functools import partial

plot_data_processing(
    data_getter = partial(get_data, data_param_1=3, data_param_N=4),
    data_processor = partial(process_data, processing_param_1='a', processing_param_2=3),
    plotter = partial(plot_data, plotting_param_1='c', plotting_param_2=1342),
    )

But this also seems unsatisfying, because there is no clear "type" for the arguments the function expects. Each one is just a partial function that should work when called, which makes things more difficult for another programmer who wants to use the function.

So, none of these methods leave me feeling fulfilled or happy. I guess I like partial, but I'd like some way to declare that a partial function obeys some interface.

Does anybody know a better way?

Python 3.5 has a new (optional) type hinting system that might do what you want. It's not checked at run-time by the Python interpreter, but it does allow you to make statements about the types of a function's arguments and return value. A separate static analyzer program like mypy can be run on the code to look for typing errors.
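To illustrate that the hints are not enforced at run time, here is a minimal sketch. The function double is hypothetical and exists only for this demonstration; a static checker such as mypy would be expected to flag the call, while the interpreter simply runs it.

```python
def double(x: int) -> int:
    # The annotation says int, but the interpreter does not enforce it.
    return x * 2

# A str argument violates the hint, yet the call still runs at run time.
result = double("ab")
print(result)

# The hints are only stored as metadata on the function object.
print(double.__annotations__)
```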

For your plot_data_processing function, I think you'd want to declare things something like this:

from typing import Callable, TypeVar

DataType = TypeVar("DataType")
ProcessedDataType = TypeVar("ProcessedDataType") # could be the same as DataType

def plot_data_processing(data_getter: Callable[[], DataType],
                         data_processor: Callable[[DataType], ProcessedDataType],
                         plotter: Callable[[ProcessedDataType], None]) -> None:
    ...

You might be able to get away with only one DataType rather than two if the data_processor function returns the processed data using the same type as the original data. You could also specify those types more specifically (e.g. with Sequence[float] or whatever, rather than using a TypeVar) if you didn't need a generic approach.
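As a rough sketch of the non-generic variant, the following combines Sequence[float] annotations with partial. The function bodies and the parameter names n_points, scale, offset and label are assumptions made up for illustration (the question never shows the real ones), and plot_data only records a string instead of drawing anything. How thoroughly a static checker verifies partial objects against a Callable annotation depends on the checker and its version.

```python
from functools import partial
from typing import Callable, List, Sequence

# Hypothetical stand-ins for the question's subfunctions.
def get_data(n_points: int = 3, scale: float = 1.0) -> List[float]:
    return [scale * i for i in range(n_points)]

def process_data(data: Sequence[float], offset: float = 0.0) -> List[float]:
    return [x + offset for x in data]

plotted: List[str] = []  # records what would have been plotted

def plot_data(data: Sequence[float], label: str = "") -> None:
    plotted.append(f"{label}: {list(data)}")

def plot_data_processing(
    data_getter: Callable[[], Sequence[float]],
    data_processor: Callable[[Sequence[float]], Sequence[float]],
    plotter: Callable[[Sequence[float]], None],
) -> None:
    # Each stage only relies on the call signature declared above.
    data = data_getter()
    processed_data = data_processor(data)
    plotter(processed_data)

# Parameters are bound with partial; the defaults stay with the subfunctions.
plot_data_processing(
    data_getter=partial(get_data, n_points=4, scale=0.5),
    data_processor=partial(process_data, offset=1.0),
    plotter=partial(plot_data, label="demo"),
)
print(plotted)
```

The Callable annotations document the interface each argument has to satisfy, so a caller can pass a partial, a lambda, or an instance of a class with __call__, as long as the signature matches.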

See PEP 484 and the documentation for the typing module for more details.
