
In Python in GAE, what is the best way to limit the risk of executing untrusted code?

I would like to enable students to submit Python code solutions to a few simple Python problems. My application will be running in GAE. How can I limit the risk from malicious code that is submitted? I realize that this is a hard problem and I have read related Stack Overflow and other posts on the subject. I am curious whether the restrictions already in place in the GAE environment make it simpler to limit the damage that untrusted code could inflict. Is it possible to simply scan the submitted code for a few restricted keywords (exec, import, etc.) and then ensure the code runs for less than a fixed amount of time, or is it still difficult to sandbox untrusted code even in the restricted GAE environment? For example:

# Import and execute untrusted code in GAE
untrustedCode = """#Untrusted code from students."""

class TestSpace(object):
    pass

testspace = TestSpace()

try:
    # Check the untrusted code somehow and raise an exception,
    # e.g. the naive keyword scan asked about above.
    for keyword in ('import', 'exec', 'eval'):
        if keyword in untrustedCode:
            raise ValueError(keyword)
except ValueError:
    print "Code attempted to import or access network"

try:
    # exec code in a new namespace (Thanks Alex Martelli)
    # limit runtime somehow
    exec untrustedCode in vars(testspace)
except Exception:
    print "Code took more than x seconds to run"

@mjv's smiley comment is actually spot-on: make sure the submitter IS identified and associated with the code in question (which presumably is going to be sent to a task queue), and log any diagnostics caused by an individual's submissions.

Beyond that, you can indeed prepare a test-space that's more restrictive (thanks for the acknowledgment;-) including a special __builtin__ that has all you want the students to be able to use and redefines __import__ &c. That, plus a token pass to forbid exec, eval, import, __subclasses__, __bases__, __mro__, etc., gets you closer. A totally secure sandbox in a GAE environment, however, is a real challenge, unless you can whitelist a tiny subset of the language for the students to use.
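As a rough sketch of that approach (not Alex's actual code; the names check_tokens, SAFE_BUILTINS and run_student_code are invented here, and this is far from a complete sandbox, since attribute-access tricks can still escape it):

import tokenize
from StringIO import StringIO

FORBIDDEN = set(['exec', 'eval', 'import',
                 '__subclasses__', '__bases__', '__mro__'])

SAFE_BUILTINS = {'abs': abs, 'len': len, 'max': max,
                 'min': min, 'range': range, 'sum': sum}

def check_tokens(source):
    # Walk the token stream and refuse any blacklisted name.
    for tok in tokenize.generate_tokens(StringIO(source).readline):
        if tok[1] in FORBIDDEN:
            raise ValueError("forbidden token: %s" % tok[1])

def run_student_code(source):
    check_tokens(source)
    # Replacing __builtins__ hides open, file, __import__, etc.
    namespace = {'__builtins__': SAFE_BUILTINS}
    exec source in namespace
    return namespace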

So I would suggest a layered approach: the sandbox GAE app in which the students upload and execute their code has essentially no persistent layer to worry about; rather, it "persists" by sending urlfetch requests to ANOTHER app, which never runs any untrusted code and is able to vet each request very critically. Default-denial with whitelisting is still the holy grail, but with such an extra layer for security you may be able to afford a default-acceptance with blacklisting...
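A minimal sketch of that layered design, assuming a second trusted app at a hypothetical URL (trusted-store.appspot.com and the form fields are invented for illustration):

from google.appengine.api import urlfetch
import urllib

def persist_result(student_id, result_text):
    # POST the result to the trusted second app, which whitelists
    # senders and vets the payload before writing anything.
    payload = urllib.urlencode({'student': student_id,
                                'result': result_text})
    response = urlfetch.fetch(
        url='https://trusted-store.appspot.com/submit',
        payload=payload,
        method=urlfetch.POST,
        deadline=10)
    return response.status_code == 200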

You really can't sandbox Python code inside App Engine with any degree of certainty. Alex's idea of logging who's running what is a good one, but if the user manages to break out of the sandbox, they can erase the event logs. The only place this information would be safe is in the per-request logging, since users can't erase that.
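For instance, something as small as this, run before any untrusted code executes, puts the submitter's identity into the per-request logs (log_submission is just an illustrative name):

import logging

def log_submission(student_id, source):
    # logging.info entries end up in App Engine's per-request logs,
    # which application code cannot erase.
    logging.info("running code from %s: %r", student_id, source[:200])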

For a good example of what a rathole trying to sandbox Python turns into, see this post. For Guido's take on securing Python, see this post.

There are a couple of other options: if you're free to choose the language, you could run Rhino (a JavaScript interpreter) on the Java runtime; Rhino is nicely sandboxed. You may also be able to use Jython; I don't know if it's practical to sandbox it, but it seems likely.

Alex's suggestion of using a separate app is also a good one. This is pretty much the approach that shell.appspot.com takes: It can't prevent you from doing malicious things, but the app itself stores nothing of value, so there's no harm if you do.

Here's an idea. Instead of running the code server-side, run it client-side with Skulpt:

http://www.skulpt.org/

This is both safer and easier to implement.
