Python33 - Improving server security with BaseHTTPRequestHandler
I have lately been improving security on my webserver, which I wrote myself using http.server and BaseHTTPRequestHandler. I have blocked (403'd) most essential server files, which I do not want users to be able to access. Files include the Python server script and all databases, plus some HTML templates.
However, in this post on Stack Overflow I read that using open(curdir + sep + self.path) in a do_GET request might potentially make every file on your computer readable. Can someone explain this to me? If the self.path is ip:port/index.html every time, how can someone access files that are above the root / directory?
I understand that the user (obviously) can change the index.html to anything else, but I don't see how they can access directories above root.
Also, if you're wondering why I'm not using nginx or apache: I wanted to create my own web server and website for learning purposes. I have no intention to run an actual website myself, and if I do want to, I will probably rent a server or use existing server software.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            if "SOME BLOCKED FILE OR DIRECTORY" in self.path:
                self.send_error(403, "FORBIDDEN")
                return
            # I have about 6 more of these 403 parts, but I left them out for readability
            if self.path.endswith(".html"):
                if self.path.endswith("index.html"):
                    # template is the Template Engine that I created to create dynamic HTML content
                    parser = template.TemplateEngine()
                    content = parser.get_content("index", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                elif self.path.endswith("auth.html"):
                    parser = template.TemplateEngine()
                    content = parser.get_content("auth", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                elif self.path.endswith("about.html"):
                    parser = template.TemplateEngine()
                    content = parser.get_content("about", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                else:
                    try:
                        f = open(curdir + sep + self.path, "rb")
                        self.send_response(200)
                        self.send_header("Content-type", "text/html")
                        self.end_headers()
                        self.wfile.write((f.read()))
                        f.close()
                        return
                    except IOError as e:
                        self.send_response(404)
                        self.send_header("Content-type", "text/html")
                        self.end_headers()
                        return
            else:
                if self.path.endswith(".css"):
                    h1 = "Content-type"
                    h2 = "text/css"
                elif self.path.endswith(".gif"):
                    h1 = "Content-type"
                    h2 = "gif"
                elif self.path.endswith(".jpg"):
                    h1 = "Content-type"
                    h2 = "jpg"
                elif self.path.endswith(".png"):
                    h1 = "Content-type"
                    h2 = "png"
                elif self.path.endswith(".ico"):
                    h1 = "Content-type"
                    h2 = "ico"
                elif self.path.endswith(".py"):
                    h1 = "Content-type"
                    h2 = "text/py"
                elif self.path.endswith(".js"):
                    h1 = "Content-type"
                    h2 = "application/javascript"
                else:
                    h1 = "Content-type"
                    h2 = "text"
                f = open(curdir+ sep + self.path, "rb")
                self.send_response(200)
                self.send_header(h1, h2)
                self.end_headers()
                self.wfile.write(f.read())
                f.close()
                return
        except IOError:
            if "html_form_action.asp" in self.path:
                pass
            else:
                self.send_error(404, "File not found: %s" % self.path)
        except Exception as e:
            self.send_error(500)
            print("Unknown exception in do_GET: %s" % e)
You're making an invalid assumption:

"If the self.path is ip:port/index.html every time, how can someone access files that are above the root / directory?"

But self.path is never ip:port/index.html. Try logging it and see what you get.
For example, if I request http://example.com:8080/foo/bar/index.html, the self.path is not example.com:8080/foo/bar/index.html, but just /foo/bar/index.html. In fact, your code couldn't possibly work otherwise, because curdir + sep + self.path would give you a path starting with ./example.com:8080/, which won't exist.
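To see this for yourself, here is a minimal sketch that starts a throwaway server on a free local port and echoes back whatever self.path the handler receives. The names LoggingHandler, seen_paths and fetch_path are made up for this illustration; they are not part of the question's code.

```python
import http.client
import http.server
import threading

seen_paths = []  # hypothetical list that records every self.path the handler sees


class LoggingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the raw request path and echo it back to the client.
        seen_paths.append(self.path)
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet; the default implementation logs to stderr


def fetch_path(request_target):
    """Send one GET for request_target and return the self.path the server saw."""
    server = http.server.HTTPServer(("127.0.0.1", 0), LoggingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", request_target)  # http.client sends the target as given
        echoed = conn.getresponse().read().decode("utf-8")
        conn.close()
        return echoed
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_path("/foo/bar/index.html"))
    print(fetch_path("/../../etc/passwd"))
```

The host and port never appear in self.path, and nothing on the way to the handler removes the .. components. A browser may tidy up such a URL before sending it, but a client like http.client (or a raw socket) will not.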
And then ask yourself what happens if it's /../../../../../../../etc/passwd.
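As a rough illustration of the answer to that question, the sketch below mimics curdir + sep + self.path with plain string concatenation and then normalizes the result the way the operating system would resolve it when no symlinks are involved. The root /srv/www and the helper name naive_target are assumptions for the example only.

```python
import posixpath


def naive_target(root, request_path):
    """Mimic curdir + sep + self.path, then collapse the ".." components lexically."""
    joined = root + "/" + request_path
    return posixpath.normpath(joined)


print(naive_target("/srv/www", "/index.html"))
# -> /srv/www/index.html
print(naive_target("/srv/www", "/../../../../../../../etc/passwd"))
# -> /etc/passwd
```

Extra .. components beyond the filesystem root are simply ignored, which is why an attacker can pile on as many as they like without knowing how deep the server directory is.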
This is one of many reasons to use os.path instead of string manipulation for paths. For example, instead of this:
f = open(curdir + sep + self.path, "rb")
Do this:
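# Strip the leading "/" first, otherwise os.path.join would discard curdir entirely.
path = os.path.abspath(os.path.join(curdir, self.path.lstrip("/")))
# Compare against curdir plus a separator so a sibling like curdir + "-old" does not match.
if os.path.commonprefix((path, curdir + os.sep)) != curdir + os.sep:
    self.send_error(403, "FORBIDDEN")  # illegal!
    return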
I'm assuming that curdir here is an absolute path, not just from os import curdir or some other thing that's more likely to give you "." than anything else. If it's the latter, make sure to abspath it as well.
This can catch other ways of escaping the jail as well as passing in .. strings… but it's not going to catch everything. For example, if there's a symlink pointing out of the jail, there's no way abspath can tell that someone's gone through the symlink.
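One possible way to cover the symlink case too is to resolve both paths with os.path.realpath, which follows symlinks, and compare whole path components with os.path.commonpath. This is only a sketch under some assumptions: commonpath needs Python 3.5 or later (so it is not available on 3.3), paths are POSIX-style, and resolve_inside is a hypothetical helper name. It also does not deal with query strings or percent-encoding in the request path.

```python
import os


def resolve_inside(root, request_path):
    """Return the real filesystem path for request_path, or None if it escapes root."""
    real_root = os.path.realpath(root)
    # Drop leading slashes so os.path.join keeps real_root as the base directory.
    candidate = os.path.realpath(os.path.join(real_root, request_path.lstrip("/")))
    try:
        # commonpath compares whole components, so "/srv/www-secret" is not
        # mistaken for something inside "/srv/www" (commonprefix would be).
        inside = os.path.commonpath([real_root, candidate]) == real_root
    except ValueError:
        # Raised, for example, when the two paths are on different Windows drives.
        inside = False
    return candidate if inside else None
```

In a handler, the idea would be to call resolve_inside(curdir, self.path), send a 403 when it returns None, and open only the returned path. There is still a small window between the check and the open in which a file could be swapped for a symlink, so this narrows the problem without fully closing it.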
self.path contains the request path. If I were to send a GET request and ask for the resource located at /../../../../../../../etc/passwd, I would break out of your application's current folder and be able to access any file on your filesystem (that you have permission to read).