How can I safely accept and run user code on my web application?

Question

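I am working on a Django-based web application that takes a Python file as input. The file contains some function; in the backend I have some lists that are passed as parameters to the user's function, which produces a single output value. The result is then used for some further computation.

Here is what the function in the user's file looks like: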

def somefunctionname(values):
    '''some computation performed on the list'''
    float_value = float(sum(values))  # placeholder for the user's own computation
    return float_value

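Currently, my approach is to accept the user's file as a normal file upload. Then, in my views.py, I execute the file as a module and pass the parameters using the eval function. The snippet is given below.

Here modulename is the name of the Python file that I took from the user and imported as a module.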

exec("import "+modulename)

result = eval(f"{modulename}.{somefunctionname}(arguments)")

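This works absolutely fine. But I know this is not a secure approach.

My question: since the approach I am using is not secure, is there any other way to run user files safely? I know that no proposed solution can be completely foolproof, but what other ways are there to run this (for example, if it can be solved with dockerization, what would the approach be, or which APIs or external tools could I use)? Or, if possible, can somebody tell me how to simply sandbox this, or point me to a tutorial that could help?

Any reference or resource will be helpful.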

Answer 1

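This is an important question. In Python, sandboxing is not trivial.

It is one of the few cases where it matters which Python interpreter you are using. For example, Jython generates Java bytecode, and the JVM has its own mechanism for running code securely.

For CPython, the default interpreter, there were originally some attempts to provide a restricted execution mode, but they were abandoned a long time ago.

Currently, there is an unofficial project, RestrictedPython, that might give you what you need. It is not a full sandbox, i.e. it will not give you restricted filesystem access or anything like that, but for your needs it may be just enough.

Basically, the people behind it rewrote the Python compilation step in a more restricted way.

What it lets you do is compile a piece of code and then execute it, all in a restricted mode. For example: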

from RestrictedPython import safe_builtins, compile_restricted

source_code = """
print('Hello world, but secure')
"""

byte_code = compile_restricted(
    source_code,
    filename='<string>',
    mode='exec'
)
exec(byte_code, {'__builtins__': safe_builtins})

>>> Hello world, but secure

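Running with __builtins__ = safe_builtins disables dangerous functionality such as opening files, importing, and so on. There are also other variants of the builtins and other options; take some time to read the docs, they are pretty good.

EDIT:

Here is an example for your use case: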

from RestrictedPython import safe_builtins, compile_restricted
from RestrictedPython.Eval import default_guarded_getitem


def execute_user_code(user_code, user_func, *args, **kwargs):
    """ Execute user code in a restricted environment
        Args:
            user_code(str) - String containing the unsafe code
            user_func(str) - Function inside user_code to execute and return value
            *args, **kwargs - arguments passed to the user function
        Return:
            Return value of the user_func
    """

    def _apply(f, *a, **kw):
        return f(*a, **kw)

    try:
        # These are the variables we allow user code to see. @result will contain the return value.
        restricted_locals = {
            "result": None,
            "args": args,
            "kwargs": kwargs,
        }

        # If you want the user to be able to use some of your functions inside their code,
        # add those functions to this dictionary.
        # By default many standard actions are disabled. Here I add _apply_ to be able to access
        # args and kwargs, and _getitem_ to be able to use arrays. Think before you add
        # anything else. I am not saying you shouldn't do it, only that you should understand
        # what you are doing.
        restricted_globals = {
            "__builtins__": safe_builtins,
            "_getitem_": default_guarded_getitem,
            "_apply_": _apply,
        }

        # Add another line to user code that executes @user_func
        user_code += "\nresult = {0}(*args, **kwargs)".format(user_func)

        # Compile the user code
        byte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")

        # Run it
        exec(byte_code, restricted_globals, restricted_locals)

        # User code has modified result inside restricted_locals. Return it.
        return restricted_locals["result"]

    except SyntaxError:
        # Do whatever you want if the user code does not compile
        raise
    except Exception:
        # The code did something that is not allowed. Add some nasty punishment to the user here.
        raise

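Now you have a function execute_user_code that receives some unsafe code as a string, the name of a function from this code, and arguments, and returns the return value of the function called with the given arguments.

Here is a very stupid example of some user code: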

example = """
def test(x, name="Johny"):
    return name + " likes " + str(x*x)
"""
# Let's see how this works
print(execute_user_code(example, "test", 5))
# Result: Johny likes 25

But here is what happens when the user code tries to do something unsafe:

malicious_example = """
import sys
print("Now I have the access to your system, muhahahaha")
"""
# Let's see how this works
print(execute_user_code(malicious_example, "test", 5))
# Result - evil plan failed:
#   Traceback (most recent call last):
#     File "restr.py", line 69, in <module>
#       print(execute_user_code(malicious_example, "test", 5))
#     File "restr.py", line 45, in execute_user_code
#       exec(byte_code, restricted_globals, restricted_locals)
#     File "<user_code>", line 2, in <module>
#   ImportError: __import__ not found

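Possible extension:

Note that the user code is compiled on every call to the function. However, you may want to compile the user code once and then execute it with different parameters. All you have to do is save byte_code somewhere and then call exec with a different set of restricted_locals each time.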
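The compile-once idea could be sketched roughly as follows. This is only an illustration: the class name CompiledUserCode and its compile_func and base_globals parameters are made up for this example, and the default compile_func is Python's built-in compile so that the snippet runs without extra packages. In the setup described above you would presumably pass compile_restricted and the restricted_globals dictionary instead; with the built-in compile there is no restriction at all.

```python
class CompiledUserCode:
    """Compile user code once, then call one of its functions many times.

    Hypothetical helper: pass compile_func=compile_restricted and
    base_globals=restricted_globals to get the restricted behaviour.
    """

    def __init__(self, user_code, user_func, compile_func=compile, base_globals=None):
        # Same trick as execute_user_code: append a line that stores the
        # function's return value in "result".
        source = user_code + "\nresult = {0}(*args, **kwargs)".format(user_func)
        self._byte_code = compile_func(source, "<user_code>", "exec")
        self._base_globals = dict(base_globals or {})

    def __call__(self, *args, **kwargs):
        # Fresh dictionaries on every call, so one run cannot leave
        # variables behind for the next one.
        run_globals = dict(self._base_globals)
        run_locals = {"result": None, "args": args, "kwargs": kwargs}
        exec(self._byte_code, run_globals, run_locals)
        return run_locals["result"]


example = """
def test(x, name="Johny"):
    return name + " likes " + str(x*x)
"""

runner = CompiledUserCode(example, "test")  # compiled only here
print(runner(5))
print(runner(6, name="Ann"))
```

The byte code is produced once in the constructor; each call only pays for exec.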

EDIT 2:

If you want to use import, you can write your own import function that only allows modules you consider safe. Example:

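def _import(name, globals=None, locals=None, fromlist=(), level=0):
    safe_modules = ["math"]
    if name in safe_modules:
        module = __import__(name, globals, locals, fromlist, level)
        # Top-level names from exec() land in restricted_locals, so also expose
        # the module through globals, where the user's functions can see it.
        globals[name] = module
        return module
    else:
        raise Exception("Don't you even think about it {0}".format(name))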

safe_builtins['__import__'] = _import # Must be a part of builtins
restricted_globals = {
    "__builtins__": safe_builtins,
    "_getitem_": default_guarded_getitem,
    "_apply_": _apply,
}

....
i_example = """
import math
def myceil(x):
    return math.ceil(x)
"""
print(execute_user_code(i_example, "myceil", 1.5))

Note that this sample import function is very primitive; it will not work with things like from x import y. You can see a more complex implementation here.
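As a rough sketch of an import hook that also copes with from x import y: the key point is that the hook returns whatever the real __import__ returns, so the import statement can pull names out of the module itself. The names ALLOWED_MODULES and guarded_import are invented for this example. The demo uses a plain exec with a hand-made __builtins__ dictionary purely to show the hook being called; that is not a sandbox, and in the setup above the function would go into safe_builtins['__import__'] instead.

```python
ALLOWED_MODULES = {"math", "statistics"}  # hypothetical allowlist


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import hook that only lets allowlisted modules through."""
    if level != 0:
        # Relative imports make no sense for a standalone user snippet.
        raise ImportError("relative imports are not allowed")
    if name not in ALLOWED_MODULES:
        raise ImportError("module {0!r} is not allowed".format(name))
    # Returning the module is what makes "from x import y" work.
    return __import__(name, globals, locals, fromlist, level)


# Demo only: plain exec is NOT a sandbox, it just shows the hook in action.
namespace = {"__builtins__": {"__import__": guarded_import}}
exec("from math import ceil\nvalue = ceil(1.5)", namespace)
print(namespace["value"])
```

When exec is given separate globals and locals, as in execute_user_code, names bound by a top-level import end up in the locals dictionary, so functions defined by the user code may not see them; that case needs extra care.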

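EDIT 3:

Note that many Python built-in features are not available out of the box in RestrictedPython; that does not mean they are not available at all. You may need to implement some function for them to become available.

Even some obvious things like sum or the += operator are not obvious in the restricted environment.

For example, the for loop uses the _getiter_ function, which you must implement and provide yourself (in globals). Since you want to avoid infinite loops, you may want to put some limit on the number of iterations allowed. Here is a sample implementation that limits the number of iterations to 100: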

MAX_ITER_LEN = 100

class MaxCountIter:
    def __init__(self, dataset, max_count):
        self.i = iter(dataset)
        self.left = max_count

    def __iter__(self):
        return self

    def __next__(self):
        if self.left > 0:
            self.left -= 1
            return next(self.i)
        else:
            raise StopIteration()

def _getiter(ob):
    return MaxCountIter(ob, MAX_ITER_LEN)

....

restricted_globals = {
    "_getiter_": _getiter,

....

for_ex = """
def sum(x):
    y = 0
    for i in range(x):
        y = y + i
    return y
"""

print(execute_user_code(for_ex, "sum", 6))

If you do not want to limit the loop count, just use the identity function as _getiter_:

restricted_globals = {
    "_getiter_": lambda x: x,

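Note that simply limiting the loop count does not guarantee security. First, loops can be nested. Second, you cannot limit the execution count of a while loop. To make it secure, you have to execute the unsafe code under some timeout.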
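To illustrate the nested-loop point, one option is a single budget shared by every for loop in a run, instead of a fresh counter per loop. This is a minimal sketch with invented names (IterationBudget, count_pairs); raising an error when the budget runs out, rather than silently stopping, is a choice made for this example. The demo calls getiter by hand to stand in for what a _getiter_ guard would do, and while loops are still not covered.

```python
class IterationBudget:
    """One iteration budget shared by all loops of a single run."""

    def __init__(self, max_total):
        self.left = max_total

    def getiter(self, iterable):
        # Generator wrapper: every item handed out by any loop is
        # charged against the same shared counter.
        for item in iterable:
            if self.left <= 0:
                raise RuntimeError("iteration budget exhausted")
            self.left -= 1
            yield item


def count_pairs(n, getiter):
    # Stand-in for user code with nested loops.
    total = 0
    for _ in getiter(range(n)):
        for _ in getiter(range(n)):
            total += 1
    return total


print(count_pairs(5, IterationBudget(100).getiter))  # 5 + 25 steps fit in the budget

try:
    count_pairs(20, IterationBudget(100).getiter)  # would need 20 + 400 steps
except RuntimeError as exc:
    print("stopped:", exc)
```

A new IterationBudget would be created for each run and its getiter method supplied as "_getiter_" in the globals.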

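Please take a moment to read the docs.

Note that not everything is documented (although many things are). You have to learn to read the project's source code for the more advanced things. The best way to learn is to try to run some code, see what kind of function is missing, then read the project's source code to understand how to implement it.

EDIT 4:

There is still another problem: restricted code may loop infinitely. To avoid that, some kind of timeout is required on the code.

Unfortunately, since you are using Django, which is multi-threaded unless you explicitly specify otherwise, the simple timeout trick using signals will not work here; you have to use multiprocessing.

The easiest way, in my opinion, is to use the timeout_decorator library. Simply add a decorator to execute_user_code so that it looks like this: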

@timeout_decorator.timeout(5, use_signals=False)
def execute_user_code(user_code, user_func, *args, **kwargs):

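And you are done. The code will never run for more than 5 seconds. Pay attention to use_signals=False; without it you may get some unexpected behavior in Django.

Also note that this is relatively heavy on resources (and I do not really see a way to overcome this). I mean, it is not crazily heavy, but it is an extra process spawn. You should keep that in mind in your web server configuration: an API that allows executing arbitrary user code is more vulnerable to DDoS.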
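For comparison, a process-based timeout can also be sketched with only the standard library. This is not the timeout_decorator approach above but a swapped-in alternative: the user code runs in a separate interpreter started with subprocess.run, which kills the child when the timeout expires. The helper name run_with_timeout and the JSON request format are assumptions made for this example, and the child here runs plain unrestricted Python, so it demonstrates only the timeout; the restrictions would still have to be applied inside the child.

```python
import json
import subprocess
import sys

# Script run inside the child interpreter: read a JSON request from stdin,
# execute the user code, call the requested function, write the JSON result.
# Assumes the user code itself prints nothing to stdout.
_RUNNER = """
import json, sys
request = json.load(sys.stdin)
namespace = {}
exec(request["code"], namespace)
result = namespace[request["func"]](*request["args"])
json.dump(result, sys.stdout)
"""


def run_with_timeout(user_code, user_func, args, timeout_seconds=5):
    """Run user_func from user_code in a child process with a time limit."""
    request = json.dumps({"code": user_code, "func": user_func, "args": list(args)})
    completed = subprocess.run(
        [sys.executable, "-I", "-c", _RUNNER],  # -I: isolated mode
        input=request,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # child is killed when this expires
        check=True,               # non-zero exit raises CalledProcessError
    )
    return json.loads(completed.stdout)


mean_code = "def mean(xs):\n    return sum(xs) / len(xs)\n"
print(run_with_timeout(mean_code, "mean", [[1, 2, 3]]))

spin_code = "def spin(xs):\n    while True:\n        pass\n"
try:
    run_with_timeout(spin_code, "spin", [[1]], timeout_seconds=1)
except subprocess.TimeoutExpired:
    print("user code timed out")
```

As with the decorator, each call costs one extra process, so the same caveat about server load applies.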

Answer 2

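With Docker you can certainly sandbox the execution if you are careful. You can restrict CPU cycles and maximum memory, close all network ports, run as a user with read-only access to the file system, and so on.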
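As a rough idea of what those restrictions look like on the command line, the sketch below only builds a docker run command; it does not start anything. The function name build_docker_command, the image python:3.11-slim, the mount path /sandbox/user_code.py and the specific limits are all assumptions chosen for illustration, not recommendations from this answer.

```python
def build_docker_command(host_script, image="python:3.11-slim",
                         memory="128m", cpus="0.5"):
    """Build a 'docker run' command line with common restrictions applied."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access
        "--memory", memory,                     # cap on RAM
        "--cpus", cpus,                         # cap on CPU share
        "--pids-limit", "64",                   # limit process count (fork bombs)
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        "--user", "nobody",                     # unprivileged user
        "--volume", "{0}:/sandbox/user_code.py:ro".format(host_script),
        image,
        "python", "/sandbox/user_code.py",
    ]


cmd = build_docker_command("/tmp/user_code.py")
print(" ".join(cmd))
```

The resulting list could then be handed to something like subprocess.run together with a timeout.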

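Still, I think this would be extremely complex. To me, you must not allow a client to execute arbitrary code like that.

I would check whether a product or solution for this already exists and use that. I am thinking of the sites that let you submit some code (Python, Java, whatever) that is executed on the server.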
