Python-Twisted: Reverse Proxy to HTTPS API: Could not connect

I am trying to build a reverse proxy that talks to certain APIs (like Twitter, GitHub, Instagram), which I can then call from any client applications I want (think of it as an API manager).

Also, I am using an LXC container to do this.

For example, here is the simplest code, which I hacked together from the examples in the Twisted docs:

from twisted.internet import reactor
from twisted.web import proxy, server
from twisted.python.log import startLogging
from sys import stdout
startLogging(stdout)

site = server.Site(proxy.ReverseProxyResource('https://api.github.com/users/defunkt', 443, b''))
reactor.listenTCP(8080, site)
reactor.run()

When I run curl within the container, I get a valid response (that is, the appropriate JSON).

Here is how I used the curl command:

curl https://api.github.com/users/defunkt

And here is the output I get:

{
  "login": "defunkt",
  "id": 2,
  "avatar_url": "https://avatars.githubusercontent.com/u/2?v=3",
  "gravatar_id": "",
  "url": "https://api.github.com/users/defunkt",
  "html_url": "https://github.com/defunkt",
  "followers_url": "https://api.github.com/users/defunkt/followers",
  "following_url": "https://api.github.com/users/defunkt/following{/other_user}",
  "gists_url": "https://api.github.com/users/defunkt/gists{/gist_id}",
  "starred_url": "https://api.github.com/users/defunkt/starred{/owner}{/repo}",
  "subscriptions_url": "https://api.github.com/users/defunkt/subscriptions",
  "organizations_url": "https://api.github.com/users/defunkt/orgs",
  "repos_url": "https://api.github.com/users/defunkt/repos",
  "events_url": "https://api.github.com/users/defunkt/events{/privacy}",
  "received_events_url": "https://api.github.com/users/defunkt/received_events",
  "type": "User",
  "site_admin": true,
  "name": "Chris Wanstrath",
  "company": "GitHub",
  "blog": "http://chriswanstrath.com/",
  "location": "San Francisco",
  "email": "chris@github.com",
  "hireable": true,
  "bio": null,
  "public_repos": 107,
  "public_gists": 280,
  "followers": 15153,
  "following": 208,
  "created_at": "2007-10-20T05:24:19Z",
  "updated_at": "2016-02-26T22:34:27Z"
}

However, when I attempt to fetch the proxy via Firefox using:

http://10.5.5.225:8080/

I get: "Could not connect"

This is what my Twisted log looks like:

2016-02-27 [-] Log opened.
2016-02-27 [-] Site starting on 8080
2016-02-27 [-] Starting factory
2016-02-27 [-] Starting factory
2016-02-27 [-] "10.5.5.225" - - [27/Feb/2016: +0000] "GET / HTTP/1.1" 501 26 "-" "Mozilla/5.0 (X11; Debian; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0"
2016-02-27 [-] Stopping factory

How can I use Twisted to make an API call (most APIs are HTTPS nowadays anyway) and get the expected response (basically, the "200" response with its JSON body)?

I tried looking at this question: Convert HTTP Proxy to HTTPS Proxy in Twisted

But it didn't make much sense from a coding point of view, and it doesn't mention anything about reverse proxying.

Edit: I also tried switching out the HTTPS API call for a regular HTTP call using:

curl http[colon][slash][slash]openlibrary[dot]org[slash]authors[slash]OL1A.json

(The URL above has been formatted to avoid a link-conflict issue.)

However, I still get the same error in my browser (as mentioned above).

Edit2: I have tried running your code, but I get this error:

Error-screenshot

If you look at the image, you will see this error (raised when running the code):

builtins.AttributeError: 'str' object has no attribute 'decode'

If you read the API documentation for ReverseProxyResource, you will see that the signature of __init__ is:

def __init__(self, host, port, path, reactor=reactor):

and "host" is documented as "the host of the web server to proxy".

So you are passing a URI where Twisted expects a host.
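As a rough illustration of the difference, the sketch below splits a full URL into the separate host, port, and path pieces that the signature above asks for. The helper name `split_for_reverse_proxy` is hypothetical (it is not part of Twisted), and it uses only the standard library:

```python
from urllib.parse import urlsplit


def split_for_reverse_proxy(url):
    """Split a full URL into separate (host, port, path) values.

    Hypothetical helper, not a Twisted API: it only shows which part of
    the URL belongs in which argument.
    """
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL names none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # The path is returned as bytes, matching how paths are passed here.
    return parts.hostname, port, parts.path.encode("ascii")


host, port, path = split_for_reverse_proxy("https://api.github.com/users/defunkt")
print(host, port, path)  # api.github.com 443 b'/users/defunkt'
```

Note that splitting the URL only produces the right arguments; it does not by itself make the upstream connection use TLS.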

Worse yet, ReverseProxyResource is designed for local use on a web server, and doesn't quite support https:// URLs out of the box.

It does have a (very limited) extensibility hook though, proxyClientFactoryClass, and to apologize for ReverseProxyResource not having what you need out of the box, I will show you how to use that to extend ReverseProxyResource to add https:// support so you can use the GitHub API :).

from twisted.web import proxy, server
from twisted.logger import globalLogBeginner, textFileLogObserver
from twisted.protocols.tls import TLSMemoryBIOFactory
from twisted.internet import ssl, defer, task, endpoints
from sys import stdout
globalLogBeginner.beginLoggingTo([textFileLogObserver(stdout)])

class HTTPSReverseProxyResource(proxy.ReverseProxyResource, object):
    def proxyClientFactoryClass(self, *args, **kwargs):
        """
        Make all connections using HTTPS.
        """
        # optionsForClientTLS needs a text hostname. On Python 3 self.host
        # is already text; decode only when it was given as bytes.
        hostname = self.host
        if isinstance(hostname, bytes):
            hostname = hostname.decode("ascii")
        return TLSMemoryBIOFactory(
            ssl.optionsForClientTLS(hostname), True,
            super(HTTPSReverseProxyResource, self)
            .proxyClientFactoryClass(*args, **kwargs))

    def getChild(self, path, request):
        """
        Ensure that implementation of C{proxyClientFactoryClass} is honored
        down the resource chain.
        """
        child = super(HTTPSReverseProxyResource, self).getChild(path, request)
        return HTTPSReverseProxyResource(child.host, child.port, child.path,
                                         child.reactor)

@task.react
def main(reactor):
    import sys
    forever = defer.Deferred()
    myProxy = HTTPSReverseProxyResource('api.github.com', 443,
                                        b'/users/defunkt')
    # Child names are bytes; map the empty segment back to the proxy itself.
    myProxy.putChild(b"", myProxy)
    site = server.Site(myProxy)
    endpoint = endpoints.serverFromString(
        reactor,
        dict(enumerate(sys.argv)).get(1, "tcp:8080:interface=127.0.0.1")
    )
    endpoint.listen(site)
    return forever

If you run this, curl http://localhost:8080/ should do what you expect.

I've taken the liberty of modernizing your Twisted code somewhat: endpoints instead of listenTCP, logger instead of twisted.python.log, and react instead of starting the reactor yourself.

The weird little putChild piece at the end is there because when we pass b"/users/defunkt" as the path, a request for / results in the client requesting /users/defunkt/ (note the trailing slash), which is a 404 in GitHub's API. If we explicitly proxy the empty-child-segment path as if it did not have the trailing segment, I believe it will do what you expect.
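To make the trailing-slash effect concrete, here is a simplified stand-in for the path joining described above. It is not Twisted's code; `upstream_path` is a hypothetical name, and the joining rule (base path, a slash, then the URL-quoted child segment) is an assumption chosen to match the behaviour just described:

```python
from urllib.parse import quote


def upstream_path(base_path, child_segment):
    """Build the path requested upstream for one child segment.

    Simplified stand-in (an assumption, not Twisted's implementation):
    base path + "/" + URL-quoted child segment.
    """
    return base_path + b"/" + quote(child_segment, safe="").encode("utf-8")


# A request for "/" on the proxy has a single, empty child segment.
print(upstream_path(b"/users/defunkt", b""))       # b'/users/defunkt/'
print(upstream_path(b"/users/defunkt", b"repos"))  # b'/users/defunkt/repos'
```

The first result carries the trailing slash that GitHub rejects, which is why the empty child is mapped back to the proxy resource itself.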

PLEASE NOTE: proxying from plain-text HTTP to encrypted HTTPS can be extremely dangerous, so I've added a default listening interface here of localhost only. If your bytes transit over an actual network, you should ensure that they are properly encrypted with TLS.
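Since the script above reads its listening endpoint description from the first command-line argument, one way to encrypt the listening side may be to pass an `ssl:` description instead of the default `tcp:` one. The sketch below only builds such a string. The helper name and the key and certificate file names are hypothetical, and it assumes the paths contain no colons (which would need escaping in an endpoint description):

```python
def tls_endpoint_description(port, private_key, certificate,
                             interface="127.0.0.1"):
    """Build an "ssl:" server endpoint description string.

    Hypothetical helper: the file names are placeholders, and the paths
    are assumed to contain no ":" characters.
    """
    return "ssl:{}:privateKey={}:certKey={}:interface={}".format(
        port, private_key, certificate, interface)


print(tls_endpoint_description(8443, "server.key", "server.crt"))
# ssl:8443:privateKey=server.key:certKey=server.crt:interface=127.0.0.1
```

The resulting string would be passed as the script's first argument in place of the default tcp:8080:interface=127.0.0.1.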
