How can I safely accept and run user code in my web application?

Question

I am working on a Django-based web app that takes a Python file as input. The file contains a function; on the backend I have some lists that are passed as parameters to the user's function, which produces a single-value output. That result is then used for further computation.

Here is what the function inside the user's file looks like:

def somefunctionname(values):
    '''Some computation performed on the list.'''
    return float(sum(values))  # placeholder; the real computation is up to the user

At present I take the user's file as a normal file upload. Then, in my views.py, I import the file as a module and call the function with eval, passing the parameters. The snippet is given below.

Here modulename is the name of the Python file that I took from the user and am importing as a module:

exec("import "+modulename)

result = eval(f"{modulename}.{somefunctionname}(arguments)")

This works absolutely fine, but I know it is not a secure approach.
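For reference, the same call can be made without building strings for exec and eval. The sketch below assumes the uploaded file sits in a directory on sys.path; call_user_function and demo_user_module are hypothetical names used only for illustration. It closes the injection hole in the module and function names, but it does not make the user's code itself any safer, because importing a module still runs it with full privileges.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def call_user_function(module_name, function_name, arguments):
    """Import module_name and call function_name(arguments) without exec/eval."""
    # Reject anything that is not a plain identifier, e.g. "os; import shutil".
    if not (module_name.isidentifier() and function_name.isidentifier()):
        raise ValueError("module and function names must be plain identifiers")
    module = importlib.import_module(module_name)
    user_function = getattr(module, function_name)
    return user_function(arguments)


# Demo with a throwaway module written to a temporary directory.
with tempfile.TemporaryDirectory() as upload_dir:
    Path(upload_dir, "demo_user_module.py").write_text(
        "def somefunctionname(values):\n    return float(sum(values))\n"
    )
    sys.path.insert(0, upload_dir)
    importlib.invalidate_caches()
    try:
        print(call_user_function("demo_user_module", "somefunctionname", [1, 2, 3]))
    finally:
        sys.path.remove(upload_dir)
```

The identifier check only protects the lookup; the body of the uploaded function is still untrusted.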

My question: since the method I am using is not secure, is there any other way to run the user's file securely? I know the proposed solutions can't be foolproof, but what other options are there (for example, if it can be solved with Docker, what would the approach be, or are there external tools that I can use through an API)? Or, if possible, can somebody tell me how I can simply sandbox this, or point me to a tutorial that can help?

Any reference or resource will be helpful.

Answer 1

It is an important question. In Python, sandboxing is not trivial.

It is one of the few cases where it matters which Python interpreter you are using. For example, Jython generates Java bytecode, and the JVM has its own mechanism to run code securely.

For CPython, the default interpreter, there were originally some attempts to make a restricted execution mode, but they were abandoned a long time ago.

Currently there is an unofficial project, RestrictedPython, that might give you what you need. It is not a full sandbox, i.e. it will not give you restricted filesystem access or the like, but for your needs it may be just enough.

Basically, its authors rewrote the Python compilation step in a more restricted way.

It lets you compile a piece of code and then execute it, all in a restricted mode. For example:

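from RestrictedPython import safe_builtins, compile_restricted

# print() inside restricted code needs extra guards (such as _print_),
# so this example sets a variable and reads it back instead.
source_code = """
message = 'Hello world, but secure'
"""

byte_code = compile_restricted(
    source_code,
    filename='<string>',
    mode='exec'
)
namespace = {"__builtins__": safe_builtins}
exec(byte_code, namespace)
print(namespace["message"])

>>> Hello world, but secure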

Running with __builtins__ = safe_builtins disables dangerous functions such as open, import, and so on. There are also other variations of builtins and other options; take some time to read the docs, they are pretty good.

EDIT:

Here is an example for your use case:

from RestrictedPython import safe_builtins, compile_restricted
from RestrictedPython.Eval import default_guarded_getitem


def execute_user_code(user_code, user_func, *args, **kwargs):
    """ Execute user code in a restricted environment
        Args:
            user_code(str) - String containing the unsafe code
            user_func(str) - Function inside user_code to execute and return value
            *args, **kwargs - arguments passed to the user function
        Return:
            Return value of the user_func
    """

    def _apply(f, *a, **kw):
        return f(*a, **kw)

    try:
        # These are the variables we allow user code to see. "result" will contain the return value.
        restricted_locals = {
            "result": None,
            "args": args,
            "kwargs": kwargs,
        }

        # If you want the user to be able to use some of your functions inside their code,
        # add those functions to this dictionary.
        # By default many standard actions are disabled. Here I add _apply_ to be able to pass
        # args and kwargs, and _getitem_ to be able to use arrays. Just think before you add
        # something else. I am not saying you shouldn't do it; you should understand what you
        # are doing, that's all.
        restricted_globals = {
            "__builtins__": safe_builtins,
            "_getitem_": default_guarded_getitem,
            "_apply_": _apply,
        }

        # Add another line to user code that executes @user_func
        user_code += "\nresult = {0}(*args, **kwargs)".format(user_func)

        # Compile the user code
        byte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")

        # Run it
        exec(byte_code, restricted_globals, restricted_locals)

        # User code has modified result inside restricted_locals. Return it.
        return restricted_locals["result"]

    except SyntaxError as e:
        # Do whatever you want if the user's code does not compile
        raise
    except Exception as e:
        # The code did something that is not allowed. Add some nasty punishment to the user here.
        raise

Now you have a function, execute_user_code, that receives some unsafe code as a string, the name of a function defined in that code, and arguments, and returns the function's return value for those arguments.

Here is a very stupid example of some user code:

example = """
def test(x, name="Johny"):
    return name + " likes " + str(x*x)
"""
# Let's see how this works
print(execute_user_code(example, "test", 5))
# Result: Johny likes 25

But here is what happens when the user code tries to do something unsafe:

malicious_example = """
import sys
print("Now I have the access to your system, muhahahaha")
"""
# Let's see how this works
print(execute_user_code(malicious_example, "test", 5))
# Result - evil plan failed:
#   Traceback (most recent call last):
#     File "restr.py", line 69, in <module>
#       print(execute_user_code(malicious_example, "test", 5))
#     File "restr.py", line 45, in execute_user_code
#       exec(byte_code, restricted_globals, restricted_locals)
#     File "<user_code>", line 2, in <module>
#   ImportError: __import__ not found

Possible extension:

Note that the user code is compiled on each call to the function. However, you may want to compile the user code once and then execute it with different parameters. All you have to do is save the byte_code somewhere and call exec with a different set of restricted_locals each time.
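A minimal sketch of the compile-once idea follows. To keep it runnable without RestrictedPython installed, it uses the built-in compile as a stand-in, so it is not restricted at all; in the real setup compile_restricted and the restricted globals from the function above would take its place. The helper names compile_user_code and run_compiled are hypothetical.

```python
user_code = """
def test(x, name="Johny"):
    return name + " likes " + str(x * x)
"""


def compile_user_code(source, user_func):
    """Compile the user code plus the call line once and return the code object."""
    full_source = source + "\nresult = {0}(*args, **kwargs)".format(user_func)
    # Stand-in only: swap in compile_restricted(...) for real use.
    return compile(full_source, "<user_code>", "exec")


def run_compiled(byte_code, *args, **kwargs):
    """Execute an already compiled code object with a fresh namespace per call."""
    namespace = {"result": None, "args": args, "kwargs": kwargs}
    exec(byte_code, namespace)
    return namespace["result"]


byte_code = compile_user_code(user_code, "test")  # compiled a single time
print(run_compiled(byte_code, 5))
print(run_compiled(byte_code, 7, name="Ann"))
```

Each call gets its own namespace, so nothing one run defines or changes leaks into the next.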

EDIT2:

If you want to use import, you can write your own import function that allows only the modules you consider safe. Example:

def _import(name, globals=None, locals=None, fromlist=(), level=0):
    safe_modules = ["math"]
    if name in safe_modules:
        globals[name] = __import__(name, globals, locals, fromlist, level)
    else:
        raise Exception("Don't you even think about it {0}".format(name))

safe_builtins['__import__'] = _import # Must be a part of builtins
restricted_globals = {
    "__builtins__": safe_builtins,
    "_getitem_": default_guarded_getitem,
    "_apply_": _apply,
}

....
i_example = """
import math
def myceil(x):
    return math.ceil(x)
"""
print(execute_user_code(i_example, "myceil", 1.5))

Note that this sample import function is VERY primitive; it will not work with things like from x import y. You can look here for a more complex implementation.
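As a rough sketch of a slightly less primitive hook: if the import function returns the module instead of writing into globals, the import machinery can handle from x import y itself. The names guarded_import and ALLOWED_MODULES are hypothetical, and the demo uses plain exec with a bare builtins dict only so that it runs without RestrictedPython; that alone is not a sandbox.

```python
import builtins

# Assumption: the only module considered safe in this sketch.
ALLOWED_MODULES = frozenset({"math"})


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Allow absolute imports of whitelisted top-level modules only."""
    root = name.partition(".")[0]
    if level != 0 or root not in ALLOWED_MODULES:
        raise ImportError("import of {0!r} is not allowed".format(name))
    # Returning the module lets Python bind "import x" and "from x import y".
    return builtins.__import__(name, globals, locals, fromlist, level)


namespace = {"__builtins__": {"__import__": guarded_import}}
exec("from math import ceil\nvalue = ceil(1.5)", namespace)
print(namespace["value"])
```

With RestrictedPython the same function would be stored under the "__import__" key of the builtins dictionary, as in the example above.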

EDIT3:

Note that a lot of Python's built-in functionality is not available out of the box in RestrictedPython; that does not mean it is not available at all. You may need to implement some function for it to become available.

Even some obvious things like sum or the += operator are not obvious in the restricted environment.

For example, the for loop uses the _getiter_ function, which you must implement and provide yourself (in globals). Since you want to avoid infinite loops, you may want to put a limit on the number of iterations allowed. Here is a sample implementation that limits the number of iterations to 100:

MAX_ITER_LEN = 100

class MaxCountIter:
    def __init__(self, dataset, max_count):
        self.i = iter(dataset)
        self.left = max_count

    def __iter__(self):
        return self

    def __next__(self):
        if self.left > 0:
            self.left -= 1
            return next(self.i)
        else:
            raise StopIteration()

def _getiter(ob):
    return MaxCountIter(ob, MAX_ITER_LEN)

....

restricted_globals = {
    "_getiter_": _getiter,

....

for_ex = """
def sum(x):
    y = 0
    for i in range(x):
        y = y + i
    return y
"""

print(execute_user_code(for_ex, "sum", 6))

If you don't want to limit the loop count, just use the identity function as _getiter_:

restricted_globals = {
    "_getiter_": lambda x: x,

Note that simply limiting the loop count does not guarantee security. First, loops can be nested. Second, you cannot limit the execution count of a while loop. To make it secure, you have to execute unsafe code under some timeout.

Please take a moment to read the docs.

Note that not everything is documented (although many things are). You have to learn to read the project's source code for more advanced things. The best way to learn is to try to run some code, see which function is missing, and then look at the project's source code to understand how to implement it.

EDIT4:

There is still another problem - restricted code may have infinite loops. To avoid it, some kind of timeout is required on the code.

Unfortunately, since you are using Django, which is multi-threaded unless you explicitly specify otherwise, the simple trick of implementing timeouts with signals will not work here; you have to use multiprocessing.

The easiest way, in my opinion, is to use this library (timeout_decorator). Simply add a decorator to execute_user_code so it looks like this:

@timeout_decorator.timeout(5, use_signals=False)
def execute_user_code(user_code, user_func, *args, **kwargs):

And you are done. The code will never run for more than 5 seconds. Pay attention to use_signals=False; without it you may get some unexpected behavior in Django.

Also note that this is relatively heavy on resources (and I don't really see a way to overcome that). Not crazy heavy, but it spawns an extra process. Keep that in mind in your web server configuration: an API that allows executing arbitrary user code is more vulnerable to DDoS.
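If you would rather not depend on an extra library, a similar time limit can be sketched with the standard library alone by running the code in a child interpreter. This swaps in subprocess for the decorator above and is only an illustration: run_with_timeout is a hypothetical helper, and the child here runs plain, unrestricted Python, so the restricted compilation would still have to happen inside the child.

```python
import subprocess
import sys


def run_with_timeout(source, timeout_seconds=5.0):
    """Run source in a separate Python process; return its stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing keeps spinning.
        return None
    return completed.stdout


print(run_with_timeout("print(sum([1, 2, 3]))"))
print(run_with_timeout("while True:\n    pass", timeout_seconds=1.0))
```

As with the decorator, every call pays for an extra process, so the same resource caveat applies.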

Answer 2

With Docker you can certainly sandbox the execution if you are careful. You can restrict CPU cycles and maximum memory, close all network ports, run as a user with read-only access to the file system, and so on.
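The restrictions listed above map onto docker run options roughly as in the sketch below. It only builds the command line and does not start a container. The image name, the limits, the user ID, the mount point and the file name user_code.py are assumptions chosen for illustration, and build_docker_command is a hypothetical helper.

```python
def build_docker_command(host_code_dir, image="python:3.12-slim"):
    """Return an argv list for a locked-down, one-shot container run."""
    return [
        "docker", "run", "--rm",
        "--network", "none",         # no network access
        "--memory", "128m",          # cap memory
        "--cpus", "0.5",             # cap CPU time
        "--pids-limit", "64",        # cap the number of processes
        "--read-only",               # read-only root file system
        "--cap-drop", "ALL",         # drop all Linux capabilities
        "--user", "65534:65534",     # run as an unprivileged user
        "-v", "{0}:/sandbox:ro".format(host_code_dir),  # user code, mounted read-only
        image,
        "python", "/sandbox/user_code.py",
    ]


print(" ".join(build_docker_command("/srv/uploads/job42")))
```

The resulting list could be passed to subprocess.run together with a timeout, so that a runaway container is also bounded in time.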

Still, I think it would be extremely complex to get this right. In my view you should not allow a client to execute arbitrary code like that.

I would check whether an existing product or solution already does this and use that. I am thinking of the sites that allow you to submit some code (Python, Java, whatever) that is executed on the server.

2 answers

Answer 1: 10, accepted, 2020-07-29 19:26:31

Answer 2: 1, 2020-07-29 22:56:12
