Restricting Python's syntax to execute user code safely: is this a secure approach?

Question

Original question:

Executing mathematical user code on a Python web server: what is the simplest secure way?

I want to be able to run user-submitted code on a Python web server. The code will be simple and mathematical in nature.

Since only a small subset of Python is required, my current approach is to whitelist allowable syntax by traversing Python's abstract syntax tree. Functions and names get special treatment: only explicitly whitelisted functions are allowed, and only unused names.

import ast

allowed_functions = set([
    #math library
    'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh',
    'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf',
    'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod',
    'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp',
    'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians',
    'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc',
    #builtins
    'abs', 'max', 'min', 'range', 'xrange'
    ])

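allowed_node_types = set([
    #Meta
    'Module', 'Assign', 'Expr',
    #Control (else branches are the orelse field of For/If, not a node type)
    'For', 'If',
    #Data
    'Store', 'Load', 'AugAssign', 'Subscript', 'Index',
    #Datatypes
    'Num', 'Tuple', 'List',
    #Operations
    'BinOp', 'Add', 'Sub', 'Mult', 'Div', 'Mod', 'Compare',
    #Comparison operators (child nodes of Compare)
    'Eq', 'NotEq', 'Lt', 'LtE', 'Gt', 'GtE'
    ])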

safe_names = set([
    'True', 'False', 'None'
    ])


class SyntaxChecker(ast.NodeVisitor):

    def check(self, syntax):
        tree = ast.parse(syntax)
        self.visit(tree)

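    def visit_Call(self, node):
        # Only plain names may be called; forms such as obj.method() have no func.id
        if not isinstance(node.func, ast.Name):
            raise SyntaxError("Only direct calls to whitelisted functions are allowed!")
        if node.func.id not in allowed_functions:
            raise SyntaxError("%s is not an allowed function!"%node.func.id)
        else:
            ast.NodeVisitor.generic_visit(self, node)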

    def visit_Name(self, node):
        try:
            eval(node.id)
        except NameError:
            ast.NodeVisitor.generic_visit(self, node)
        else:
            if node.id not in safe_names and node.id not in allowed_functions:
                raise SyntaxError("%s is a reserved name!"%node.id)
            else:
                ast.NodeVisitor.generic_visit(self, node)

    def generic_visit(self, node):
        if type(node).__name__ not in allowed_node_types:
            raise SyntaxError("%s is not allowed!"%type(node).__name__)
        else:
            ast.NodeVisitor.generic_visit(self, node)

if __name__ == '__main__':
    x = SyntaxChecker()
    while True:
        try:
            x.check(raw_input())
        except Exception as e:
            print e

This seems to accept the required syntax, but I am reasonably new to programming and could be missing any number of gaping security holes.

So my questions are: is this secure, is there a better approach, and are there any other precautions I should be taking?

Answer 1

Two points I noticed that you could still improve:

You should always escape any output that can be generated from some form of user input. In your example, the disallowed identifiers are echoed back to the output unmodified. This could potentially be exploited, for example through cross-site scripting. I would therefore also escape any such error message.
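As a rough illustration of the escaping step, here is a minimal sketch assuming Python 3 and an HTML response. The helper name format_error is hypothetical and not part of the code in the question; html.escape is the standard library function.

```python
import html


def format_error(user_text):
    """Build an error message that is safe to embed in an HTML page.

    format_error is a hypothetical helper; user_text is whatever
    fragment of the submitted code ends up in the message.
    """
    # html.escape replaces &, <, > and quote characters with HTML entities,
    # so the browser renders the text instead of interpreting it as markup.
    return "%s is not an allowed function!" % html.escape(user_text)


if __name__ == "__main__":
    print(format_error("<script>alert(1)</script>"))
```

If the message is delivered some other way (JSON, plain text, a log file), the escaping has to match that output format instead.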

Another thing you need to be aware of is denial-of-service attacks. Imagine someone whips up an Ackermann function and a script that submits it a couple of thousand times to your server. To prevent this, you should timebox the execution of any submitted code. This is essential, because this type of "attack" often happens unintentionally, for instance when someone manages to produce an infinite loop.
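One possible way to timebox execution is to run the submitted code in a separate interpreter process and kill it when a deadline passes. The sketch below assumes Python 3.7 or later; run_with_timeout is a hypothetical helper name. The timeout only bounds wall-clock time and is not a sandbox, so it would be applied to code that has already passed the syntax check.

```python
import subprocess
import sys


def run_with_timeout(source, seconds=2.0):
    """Run source in a child interpreter; return (finished, stdout).

    Hypothetical helper: finished is False when the deadline was hit.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=seconds,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout


if __name__ == "__main__":
    print(run_with_timeout("print(6 * 7)"))
    print(run_with_timeout("while True: pass", seconds=1.0))
```

Running each submission in its own process also means a runaway script cannot block the web server's own worker.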

Edit:

Finally, I would also recommend updating your Python version to prevent a "hashDoS" attack.

Answer 2

OpenERP's source code contains a safe_eval.py that does a similar thing. Instead of checking the AST of the source, however, it restricts the bytecode that is allowed to execute. You may want to have a look at it :)
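To give a feel for what inspecting bytecode looks like, here is a simplified sketch assuming Python 3. It is not OpenERP's implementation: it only reports a handful of opcodes chosen for illustration, whereas a real checker would more likely whitelist opcodes, and opcode names differ between Python versions. FORBIDDEN_OPCODES and find_forbidden_opcodes are hypothetical names.

```python
import dis
import types

# Illustrative deny-list only; a production checker would whitelist instead.
FORBIDDEN_OPCODES = {
    "IMPORT_NAME", "IMPORT_FROM",
    "LOAD_ATTR", "LOAD_METHOD", "STORE_ATTR", "DELETE_ATTR",
    "STORE_GLOBAL", "DELETE_GLOBAL",
}


def find_forbidden_opcodes(source):
    """Compile source (without running it) and return the forbidden opcodes found."""
    found = set()
    pending = [compile(source, "<user code>", "exec")]
    while pending:
        code = pending.pop()
        for instruction in dis.get_instructions(code):
            if instruction.opname in FORBIDDEN_OPCODES:
                found.add(instruction.opname)
        # Function bodies and comprehensions are nested code objects in co_consts.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return found


if __name__ == "__main__":
    print(find_forbidden_opcodes("x = 1 + 2"))
    print(find_forbidden_opcodes("import os"))
```

The code is compiled but never executed during the check, so the inspection itself does not run anything the user submitted.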

Answer 3

Have you looked at PyPy's sandboxing features? It is reputedly much safer than any CPython sandboxing effort. You can even limit the heap size and CPU execution time to prevent denial of service.
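For comparison, memory and CPU limits can also be approximated on plain CPython with operating-system resource limits. This is a different technique from PyPy's sandbox and provides no isolation by itself. The sketch assumes Python 3.7 or later on Linux, where the resource module and both limits are available; run_limited and its default values are hypothetical.

```python
import resource
import subprocess
import sys


def run_limited(source, cpu_seconds=1, memory_bytes=512 * 1024 * 1024):
    """Run source in a child interpreter under CPU-time and address-space limits."""

    def apply_limits():
        # Executed in the child just before the interpreter starts.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    return subprocess.run(
        [sys.executable, "-c", source],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    print(run_limited("print(1 + 1)").stdout)
    # The kernel terminates the child once it exceeds its CPU budget,
    # which shows up as a non-zero return code.
    print(run_limited("while True: pass").returncode)
```

Unlike a wall-clock timeout, RLIMIT_CPU counts only processor time, so a script that merely sleeps is not stopped by it.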

3 answers

Answer 1: score 5, accepted, 2012-05-19 00:23:18

Answer 2: score 2, 2012-05-20 05:37:49

Answer 3: score 1, 2012-05-20 03:05:01
