简体   繁体   中英

How to pass a constant variable to a function

which one is a better design in the following scenario and why?

A:

stop_words = ['com1', 'com2'] 
    
def clean_text(text_tokens, stop_words):
    return [token for token in text_tokens if token not in stop_words]

clean_text(['hello', 'world', 'com1', 'com2'], stop_words)

B:

def clean_text(text_tokens):
    stop_words = ['com1', 'com2']
    return [token for token in text_tokens if token not in stop_words]

clean_text(['hello', 'world', 'com1', 'com2'])

C:

STOP_WORDS = ['com1', 'com2'] 
    
def clean_text(text_tokens):
    return [token for token in text_tokens if token not in STOP_WORDS]

clean_text(['hello', 'world', 'com1', 'com2'])

Added C version based on @MisterMiyagi answer.

Note1: In this context, stop_words is fixed and does not change.

Note2: stop_words can be a small or a very large list.

Middle ground: use a default value for the argument.

def clean_text(text_tokens, stop_words={'com1', 'com2'}):
    return [token for token in text_tokens if token not in stop_words]

clean_text(['hello', 'world', 'com1', 'com2'])

Now the constant {'com1', 'com2'} is only created once (when the function is defined); it doesn't pollute the global scope; and if you end up wanting to, you can optionally pass different stop_words when you call clean_text .

Prefer to create constants at global scope. The global scope is evaluated once , whereas function-local scope is evaluated on each function call .

For very large searches, prefer to use a set due to its O(1) lookup, versus the list O(n) lookup. Values that are intended as constants should be named with ALL_CAPS_NAMES . Functions should directly reference constants iff they are not meant to be replaced.

STOP_WORDS = {'com1', 'com2'}  # constant set of words
    
def clean_text(text_tokens):
    return [token for token in text_tokens if token not in STOP_WORDS]
    #                             directly access constant ^

clean_text(['hello', 'world', 'com1', 'com2'])

For small constants, it may be advantageous to provide them as a literal. Even CPython is able to optimise inline literals to actual constants.

def clean_text(text_tokens):
    return [
        token
        for token in text_tokens
        if token not in {'com1', 'com2'}
        #               ^ compiled as LOAD_CONST (frozenset({'com2', 'com1'}))
    ]

clean_text(['hello', 'world', 'com1', 'com2'])

The current optimiser converts list and tuple literals to tuple constants, and set and frozenset literals to frozenset constants

Scenario A is better if you want to pass different lists for stop_words as a parameter each time you call the function, while Scenario B only test it for ['com1','com2'] ,which mean you only change this list when you edit the function itself.

In conclusion: Scenario A is better to test different lists and pass them as parameters in the function.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM