Identify unintentional read/write of global variables inside a Python function? For example using static analysis?

One of the things I find frustrating with Python is that if I write a function like this:

def UnintentionalValueChangeOfGlobal(a):
    SomeDict['SomeKey'] = 100 + a
    b = 0.5 * SomeDict['SomeKey']
    return b

And then run it like so:

SomeDict = {}
SomeDict['SomeKey'] = 0
b = UnintentionalValueChangeOfGlobal(10)
print(SomeDict['SomeKey'])

Python will: 1) find and use SomeDict during the function call even though I have forgotten to provide it as an input to the function; 2) permanently change the value of SomeDict['SomeKey'] even though it is not included in the return statement of the function.

For me this often leads to variables unintentionally changing values: SomeDict['SomeKey'] in this case becomes 110 after the function is called, when the intent was only to manipulate the function output b.

In this case I would have preferred that Python: 1) crashes with an error inside the function saying that SomeDict is undefined; 2) under no circumstances permanently changes the value of any variable other than the output b after the function has been called.

I understand that it is not possible to disable the use of globals altogether in Python, but is there a simple method (a module, an IDE, etc.) which can perform static analysis on my Python functions and warn me when a function is using and/or changing the value of variables which are not the function's output? I.e., warn me whenever variables are used or manipulated which are not local to the function?

One of the reasons Python doesn't provide any obvious and easy way to prevent accessing (undeclared) global names in a function is that in Python everything (well, everything that can be assigned to a name at least) is an object, including functions, classes and modules, so preventing a function from accessing undeclared global names would make for quite verbose code... And nested scopes (closures etc.) don't help either.

And, of course, despite globals being evil, there ARE still legitimate reasons for mutating a global object sometimes. FWIW, even linters (well, pylint and pyflakes at least) don't seem to have any option to detect this AFAICT, but you'll have to double-check by yourself, as I might have overlooked it or it might exist as a pylint extension or in another linter.

OTOH, I have very seldom had bugs coming from such an issue in 20+ years (I can't remember a single occurrence actually). Routinely applying basic good practices (short functions avoiding side effects as much as possible, meaningful names and good naming conventions, unit testing at least the critical parts, etc.) seems to be effective enough to prevent such issues.

One of the points here is that I have a rule that non-callable globals are to be treated as (pseudo) constants, which is denoted by naming them ALL_UPPER. This makes it very obvious when you actually either mutate or rebind one...

As a more general rule: Python is by nature a very dynamic language (heck, you can even change the class of an object at runtime...) with a "we're all consenting adults" philosophy, so it's indeed "lacking" most of the safety guards you'll find in more "B&D" languages like Java, and relies instead on conventions, good practices and plain common sense.

Now, Python is not only very dynamic but also exposes much of its internals, so you can certainly (if this doesn't already exist) write a pylint extension that would at least detect global names in function code (hint: you can access the compiled code of a function object with yourfunc.func_code (py2) or yourfunc.__code__ (py3) and then inspect what names are used in the code). But unless you have to deal with a team of sloppy, undisciplined devs (in which case you have another issue: there are no technical solutions to stupidity), my very humble opinion is that you're wasting your time.
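As a rough illustration of the hint above, here is a minimal Python 3 sketch that reads the names recorded on a function's code object. The helper name names_used and the global UNDECLARED_TOTAL are made up for this example. Note that co_names also contains attribute names (for np.unique(x) it would list both np and unique), so this is only a first approximation and not a complete global detector.

```python
import builtins


def names_used(func):
    """Return the non-local names stored on func's code object, minus builtins.

    co_names holds global names *and* attribute names, so the result can
    contain entries that are not globals.
    """
    code = func.__code__
    return [name for name in code.co_names if not hasattr(builtins, name)]


def example(a):
    # UNDECLARED_TOTAL is a hypothetical global that was never passed in.
    return a + UNDECLARED_TOTAL


print(names_used(example))
```

Because the code object exists as soon as the function is defined, this works even when the global has not been created yet.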

Ideally I would have wanted the global-checking functionality I'm searching for to be implemented within an IDE and continuously used to assess the use of globals in functions. But since that does not appear to exist, I threw together an ad hoc function which takes a Python function as input and then looks at the bytecode instructions of the function to see if there are any LOAD_GLOBAL or STORE_GLOBAL instructions present. If it finds any, it tries to assess the type of the global and compare it to a list of user-provided types (int, float, etc.). It then prints out the names of all global variables used by the function.

The solution is far from perfect and quite prone to false positives. For instance, if np.unique(x) is used in a function before numpy has been imported (import numpy as np), it will erroneously identify np as a global variable instead of a module. It will also not look into nested functions etc.
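One way the module false positive might be reduced, once the import has actually run, is to skip globals whose current value is a module, a function or a class. The sketch below is separate from the checker described in this answer. The names data_globals, IGNORED_KINDS, SCALE and scaled_root are hypothetical, and math stands in for numpy so that the example has no third-party dependency.

```python
import builtins
import dis
import math
import types

# Kinds of globals that are usually intentional: modules, functions, classes.
IGNORED_KINDS = (types.ModuleType, types.FunctionType,
                 types.BuiltinFunctionType, type)


def data_globals(func):
    """List globals loaded by func that are not modules, functions, classes or builtins."""
    found = []
    for instr in dis.get_instructions(func):
        if instr.opname != 'LOAD_GLOBAL':
            continue
        name = instr.argval
        if hasattr(builtins, name):
            continue
        # Look the name up where the function itself would find it.
        value = func.__globals__.get(name)
        if isinstance(value, IGNORED_KINDS):
            continue
        if name not in found:
            found.append(name)
    return found


SCALE = 2.0


def scaled_root(x):
    return SCALE * math.sqrt(x)


print(data_globals(scaled_root))
```

A name that has not been defined yet still shows up in the result, so the limitation described above remains if the check runs before the import.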

But for simple cases such as the example in this post it seems to work fine. I just used it to scan through all the functions in my codebase and it found another global usage that I was unaware of, so at least for me it is useful to have!

Here is the function:

def CheckAgainstGlobals(function, vartypes):
    """
    Check whether another function reads/writes global variables. Only
    variables of the types contained in 'vartypes', and variables of
    unknown type, are included in the output.

    Inputs:
        function - a python function
        vartypes - a list of variable types (int, float, dict,...)
    Example:
        # Define a function
        def testfcn(a):
            a = 1 + b
            return a

        # Check if the function reads/writes global variables.
        CheckAgainstGlobals(testfcn, [int, float, dict, complex, str])

        # Should output:
        >> Global-check of fcn: testfcn
        >> Loaded global variable: b (of unknown type)
    """
    import builtins
    import dis

    globalsFound = []
    # Names are resolved where the checked function would find them:
    # first in its module globals, then in builtins.
    namespace = function.__globals__
    # Step through each bytecode instruction in the function.
    for instr in dis.Bytecode(function):
        # Only instructions that load or store a global are of interest.
        if instr.opname == 'LOAD_GLOBAL':
            action = 'Loaded'
        elif instr.opname == 'STORE_GLOBAL':
            action = 'Stored'
        else:
            continue
        name = instr.argval
        if name in namespace or hasattr(builtins, name):
            value = namespace[name] if name in namespace else getattr(builtins, name)
            # The type is known: report the global only if it matches one
            # of the types provided in vartypes.
            messages = ['%s global variable: %s (of type %s)' % (action, name, t)
                        for t in vartypes if isinstance(value, t)]
        else:
            # The name does not exist (yet), so its type cannot be determined.
            messages = ['%s global variable: %s (of unknown type)' % (action, name)]
        for s in messages:
            if s not in globalsFound:
                globalsFound.append(s)
    # Print out summary of detected global variable usage.
    if not globalsFound:
        print('\nGlobal-check of fcn: %s. No read/writes of global variables were detected.' % function.__code__.co_name)
    else:
        print('\nGlobal-check of fcn: %s' % function.__code__.co_name)
        for s in globalsFound:
            print(s)

When used on the function in the example directly after the function has been declared, it will warn about the usage of the global variable SomeDict but it will not be aware of its type:

def UnintentionalValueChangeOfGlobal(a):
    SomeDict['SomeKey'] = 100 + a
    b = 0.5 * SomeDict['SomeKey']
    return b
# Will find the global, but not know its type.
CheckAgainstGlobals(UnintentionalValueChangeOfGlobal,[int, float, dict, complex, str])

>> Global-check of fcn: UnintentionalValueChangeOfGlobal
>> Loaded global variable: SomeDict (of unknown type)

When used after SomeDict has been defined it also detects that the global is a dict:

SomeDict = {}
SomeDict['SomeKey'] = 0
b = UnintentionalValueChangeOfGlobal(10)
print(SomeDict['SomeKey'])
# Will find the global, and also see its type.
CheckAgainstGlobals(UnintentionalValueChangeOfGlobal,[int, float, dict, complex, str])

>> Global-check of fcn: UnintentionalValueChangeOfGlobal
>> Loaded global variable: SomeDict (of type <class 'dict'>)

Note: in its current state the function fails to detect that SomeDict['SomeKey'] changes value. I.e., it only detects the load instruction, not that the previous value of the global is manipulated. That is because the instruction STORE_SUBSCR seems to be used in this case instead of STORE_GLOBAL. But the use of the global is still detected (since it is being loaded), which is enough for me.
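If detecting the write itself matters, one possible alternative is to inspect the source with the standard ast module instead of the bytecode. The sketch below is not part of the function above and swaps in a different technique: it reports names that a function assigns to through a subscript (name[key] = ...) when that name is neither a parameter nor assigned inside the function. The helper name subscript_writes_to_nonlocals and the SOURCE string are made up for the example. It deliberately ignores *args, keyword-only arguments, nested functions and global declarations.

```python
import ast

SOURCE = '''
def UnintentionalValueChangeOfGlobal(a):
    SomeDict['SomeKey'] = 100 + a
    b = 0.5 * SomeDict['SomeKey']
    return b
'''


def subscript_writes_to_nonlocals(source):
    """Map each function name to the non-local names it writes to via name[key] = ..."""
    report = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Locals: positional parameters plus every plain name assigned in the body.
        local_names = {arg.arg for arg in func.args.args}
        for node in ast.walk(func):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                local_names.add(node.id)
        written = []
        for node in ast.walk(func):
            if (isinstance(node, ast.Subscript)
                    and isinstance(node.ctx, ast.Store)
                    and isinstance(node.value, ast.Name)
                    and node.value.id not in local_names
                    and node.value.id not in written):
                written.append(node.value.id)
        report[func.name] = written
    return report


print(subscript_writes_to_nonlocals(SOURCE))
```

Working on source text means the check needs no import or execution of the code being analysed, at the cost of missing writes made through aliases or method calls such as SomeDict.update(...).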

You can check the variable using globals():

def UnintentionalValueChangeOfGlobal(a):

    if 'SomeDict' in globals():
        raise Exception('Var in globals')

    SomeDict['SomeKey'] = 100 + a
    b = 0.5 * SomeDict['SomeKey']
    return b

SomeDict = {}
SomeDict['SomeKey'] = 0
b = UnintentionalValueChangeOfGlobal(10)
print(SomeDict['SomeKey'])
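A related runtime trick, shown here only as a sketch, is to rebuild the function with an almost empty globals dictionary using types.FunctionType, so that any access to a module-level name raises NameError, which is the behaviour asked for in the question. The helper name without_globals is hypothetical. A function rebuilt this way also loses access to imported modules and to other functions in the module, so it only suits small self-contained functions, for instance inside a unit test.

```python
import builtins
import types


def without_globals(func):
    """Return a copy of func that can see builtins but none of its module's globals."""
    return types.FunctionType(
        func.__code__,
        {'__builtins__': builtins},
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )


SomeDict = {'SomeKey': 0}


def UnintentionalValueChangeOfGlobal(a):
    SomeDict['SomeKey'] = 100 + a
    b = 0.5 * SomeDict['SomeKey']
    return b


try:
    without_globals(UnintentionalValueChangeOfGlobal)(10)
except NameError as err:
    # The isolated copy cannot find SomeDict, and the real dict is untouched.
    print('NameError:', err)
print(SomeDict['SomeKey'])
```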
