
Python requests - print entire http request (raw)?

While using the requests module, is there any way to print the raw HTTP request?

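I don't want just the headers; I want the request line, headers, and content printed out. Is it possible to see what is ultimately constructed from the HTTP request?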

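Since v1.2.3, Requests has had the PreparedRequest object. According to the documentation, "it contains the exact bytes that will be sent to the server".

You can use it to pretty-print a request, like so: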

import requests

req = requests.Request('POST','http://stackoverflow.com',headers={'X-Custom':'Test'},data='a=1&b=2')
prepared = req.prepare()

def pretty_print_POST(req):
    """
    At this point it is completely built and ready
    to be fired; it is "prepared".

    However pay attention at the formatting used in 
    this function because it is programmed to be pretty 
    printed and may differ from the actual request.
    """
    print('{}\n{}\r\n{}\r\n\r\n{}'.format(
        '-----------START-----------',
        req.method + ' ' + req.url,
        '\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
        req.body,
    ))

pretty_print_POST(prepared)

which produces:

-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test

a=1&b=2

Then you can send the actual request with this:

s = requests.Session()
s.send(prepared)

These links point to the latest documentation available, so their content may change: Advanced - Prepared requests and API - Lower level classes.
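One caveat with `req.prepare()` as used above: it does not apply session-level state such as default headers or cookies. A minimal sketch, assuming you want the printed request to match what a particular `Session` would send, is to prepare through `Session.prepare_request` instead. The `X-Session` header is a made-up placeholder for illustration only.

```python
import requests

session = requests.Session()
# Hypothetical session-wide header, only here to make the merge visible.
session.headers.update({'X-Session': 'from-session'})

req = requests.Request('POST', 'http://stackoverflow.com',
                       headers={'X-Custom': 'Test'}, data='a=1&b=2')

# Request.prepare() knows nothing about the session, while
# Session.prepare_request() merges the session's headers, cookies,
# auth and params into the prepared request.
plain = req.prepare()
merged = session.prepare_request(req)

print(sorted(plain.headers))
print(sorted(merged.headers))
# Nothing has been sent so far; session.send(merged) would send it.
```

Either prepared object can be passed to a pretty-printer such as `pretty_print_POST`; the `merged` one should be closer to what `session.send` actually transmits.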

import requests

response = requests.post('http://httpbin.org/post', data={'key1':'value1'})
print(response.request.url)
print(response.request.body)
print(response.request.headers)

The Response object has a .request property, which is the original PreparedRequest object that was sent.

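A much better idea is to use the requests_toolbelt library, which can dump both requests and responses as strings for you to print to the console. It handles all the tricky cases with files and encodings that the above solution does not handle well.

It's as easy as this: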

import requests
from requests_toolbelt.utils import dump

resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))

Source: https://toolbelt.readthedocs.org/en/latest/dumputils.html

You can install it by typing:

pip install requests_toolbelt

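Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS's answer documents.

It isn't possible to get the true raw content of the request out of requests, since it only deals with higher-level objects, such as headers and method type. requests uses urllib3 to send requests, but urllib3 doesn't deal with raw data either: it uses httplib. Here's a representative stack trace of a request: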

-> r= requests.get("http://google.com")
  /usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
  /usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)

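Inside the httplib machinery, we can see that HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally creates the raw request and body (if it exists), and uses HTTPConnection.send to send them separately. send finally reaches the socket.

Since there are no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It's a fragile solution, and you may need to adapt it if httplib is changed. If you intend to distribute software using this solution, you may want to consider packaging httplib instead of using the system's, which is easy, since it's a pure Python module.

Alas, without further ado, the solution (Python 2, as in the stack trace above):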

import requests
import httplib

def patch_send():
    old_send= httplib.HTTPConnection.send
    def new_send( self, data ):
        print data
        return old_send(self, data) #return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send= new_send

patch_send()
requests.get("http://www.python.org")

which yields the output:

GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae
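The patch above is Python 2 only (`httplib`, `print data`). A rough Python 3 equivalent, assuming urllib3 still ends up calling `http.client.HTTPConnection.send`, could look like the sketch below. To stay self-contained it talks to a throwaway local server through `http.client` directly instead of going through requests; the `captured` list, the `Handler` class and the `patched` flag are illustrative names, not part of any library.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # raw chunks handed to HTTPConnection.send


def patch_send():
    old_send = http.client.HTTPConnection.send
    if getattr(old_send, 'patched', False):
        return  # already wrapped, do not wrap twice

    def new_send(self, data):
        captured.append(data)
        print(data)  # bytes in the common case
        return old_send(self, data)

    new_send.patched = True
    http.client.HTTPConnection.send = new_send


class Handler(BaseHTTPRequestHandler):
    """Tiny local server so the example needs no network access."""

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Length', '2')
        self.end_headers()
        self.wfile.write(b'ok')

    def log_message(self, *args):
        pass  # keep the server quiet


server = HTTPServer(('127.0.0.1', 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

patch_send()
conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
conn.request('GET', '/')
resp = conn.getresponse()
status = resp.status
resp.read()
conn.close()
server.shutdown()
server.server_close()
```

The same fragility warning applies: this wraps a library method globally, so it is better suited to debugging sessions than to shipped code.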

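Here is code that does the same, but with response headers (also Python 2; socket._fileobject does not exist in Python 3):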

import socket
def patch_requests():
    old_readline = socket._fileobject.readline
    if not hasattr(old_readline, 'patched'):
        def new_readline(self, size=-1):
            res = old_readline(self, size)
            print res,
            return res
        new_readline.patched = True
        socket._fileobject.readline = new_readline
patch_requests()

I spent a lot of time searching for this, so I'm leaving it here in case someone needs it.
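On Python 3, a less invasive way to see both the request bytes and the response headers is the standard library's own debug output. The sketch below assumes a throwaway local server (the `Handler` class is illustrative) and uses `http.client` directly so that it runs without network access; `set_debuglevel` is the documented `http.client.HTTPConnection` method.

```python
import contextlib
import http.client
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Tiny local server so the example needs no network access."""

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Length', '2')
        self.end_headers()
        self.wfile.write(b'ok')

    def log_message(self, *args):
        pass  # keep the server quiet


server = HTTPServer(('127.0.0.1', 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
conn.set_debuglevel(1)  # http.client prints "send:", "reply:", "header:" lines

# The debug lines go to stdout; capture them so they can be inspected.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    conn.request('GET', '/')
    resp = conn.getresponse()
    resp.read()
conn.close()
server.shutdown()
server.server_close()

debug_output = buf.getvalue()
print(debug_output)
```

To apply this to requests, a commonly used trick is to set the class attribute `http.client.HTTPConnection.debuglevel = 1` before making the call, on the assumption that urllib3's connection classes inherit it.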

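requests supports so-called event hooks (as of 2.23 there is actually only the response hook). The hook can be attached to a request to print the full request-response pair's data, including the effective URL, headers, and bodies, for example: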

import textwrap
import requests

def print_roundtrip(response, *args, **kwargs):
    format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
    print(textwrap.dedent('''
        ---------------- request ----------------
        {req.method} {req.url}
        {reqhdrs}

        {req.body}
        ---------------- response ----------------
        {res.status_code} {res.reason} {res.url}
        {reshdrs}

        {res.text}
    ''').format(
        req=response.request, 
        res=response, 
        reqhdrs=format_headers(response.request.headers), 
        reshdrs=format_headers(response.headers), 
    ))

requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})

Running it prints:

---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

<!DOCTYPE html>
<html lang="en">
...
</html>

If the response is binary, you may want to change res.text to res.content.
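A minimal sketch of that switch, assuming you would rather choose automatically from the Content-Type header. `body_for_display` and its `limit` parameter are hypothetical names, not part of requests, and the test for "textual" content types is a rough heuristic.

```python
def body_for_display(content, content_type='', limit=200):
    """Return a printable string for a response body given as bytes.

    Textual content types are decoded; anything else is summarised
    with its length and a short repr() preview.
    """
    textual = (content_type.startswith('text/')
               or 'json' in content_type
               or 'xml' in content_type)
    if textual:
        # errors='replace' avoids a crash on a mislabelled payload.
        return content.decode('utf-8', errors='replace')
    preview = content[:limit]
    return f'<{len(content)} bytes of binary data> {preview!r}'


print(body_for_display(b'{"a": 1}', 'application/json'))
print(body_for_display(b'\x89PNG\r\n', 'image/png'))
```

Inside the hook it could be called as `body_for_display(res.content, res.headers.get('Content-Type', ''))` in place of `res.text`.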

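I use the following function to format requests. It's like @AntonioHerraizS's, except that it also pretty-prints JSON objects in the body and labels all parts of the request.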

import functools
import json
import textwrap

format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix='  ')

def format_prepared_request(req):
    """Pretty-format 'requests.PreparedRequest'

    Example:
        res = requests.post(...)
        print(format_prepared_request(res.request))

        req = requests.Request(...)
        req = req.prepare()
        print(format_prepared_request(req))
    """
    headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
    content_type = req.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(json.loads(req.body))
        except json.JSONDecodeError:
            body = req.body
    else:
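        body = req.body
    # req.body may be None (e.g. a GET) or bytes; indent() needs a str
    if body is None:
        body = ''
    elif isinstance(body, bytes):
        body = body.decode('utf-8', errors='replace')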
    s = textwrap.dedent("""
    REQUEST
    =======
    endpoint: {method} {url}
    headers:
    {headers}
    body:
    {body}
    =======
    """).strip()
    s = s.format(
        method=req.method,
        url=req.url,
        headers=indent(headers),
        body=indent(body),
    )
    return s

I have a similar function to format responses:

def format_response(resp):
    """Pretty-format 'requests.Response'"""
    headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
    content_type = resp.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(resp.json())
        except json.JSONDecodeError:
            body = resp.text
    else:
        body = resp.text
    s = textwrap.dedent("""
    RESPONSE
    ========
    status_code: {status_code}
    headers:
    {headers}
    body:
    {body}
    ========
    """).strip()

    s = s.format(
        status_code=resp.status_code,
        headers=indent(headers),
        body=indent(body),
    )
    return s

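A fork of @AntonioHerraizS's answer (the HTTP version is missing there, as stated in the comments).

Use this code to get a string representing the raw HTTP packet without sending it: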

import requests


def get_raw_request(request):
    request = request.prepare() if isinstance(request, requests.Request) else request
    headers = '\r\n'.join(f'{k}: {v}' for k, v in request.headers.items())
    body = '' if request.body is None else request.body.decode() if isinstance(request.body, bytes) else request.body
    return f'{request.method} {request.path_url} HTTP/1.1\r\n{headers}\r\n\r\n{body}'


headers = {'User-Agent': 'Test'}
request = requests.Request('POST', 'https://stackoverflow.com', headers=headers, json={"hello": "world"})
raw_request = get_raw_request(request)
print(raw_request)

Result:

POST / HTTP/1.1
User-Agent: Test
Content-Length: 18
Content-Type: application/json

{"hello": "world"}

💡 You can also print the request from a response object:

r = requests.get('https://stackoverflow.com')
raw_request = get_raw_request(r.request)
print(raw_request)
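Note that the result above has no `Host` header: it is not part of `PreparedRequest.headers` and gets added lower in the stack at send time. If you want the printed text to look closer to what goes on the wire, a sketch like the following could fill it in. `add_host_header` is a hypothetical helper, and its handling of default ports is a simplification (it does not re-add brackets for IPv6 literals, for instance).

```python
from urllib.parse import urlsplit


def add_host_header(headers, url):
    """Return a new dict with a Host header first, derived from url.

    An existing Host header (any capitalisation) is left untouched.
    """
    if any(name.lower() == 'host' for name in headers):
        return dict(headers)
    parts = urlsplit(url)
    default_port = {'http': 80, 'https': 443}.get(parts.scheme)
    host = parts.hostname or ''
    # Only spell out the port when it differs from the scheme's default.
    if parts.port is not None and parts.port != default_port:
        host = f'{host}:{parts.port}'
    return {'Host': host, **headers}


print(add_host_header({'User-Agent': 'Test'}, 'https://stackoverflow.com'))
print(add_host_header({'User-Agent': 'Test'}, 'http://127.0.0.1:5000/'))
```

In `get_raw_request`, the header loop could then iterate over `add_host_header(request.headers, request.url).items()` instead of `request.headers.items()`.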

Contents of test_print.py:

import logging
import pytest
import requests
from requests_toolbelt.utils import dump


def print_raw_http(response):
    data = dump.dump_all(response, request_prefix=b'', response_prefix=b'')
    return '\n' * 2 + data.decode('utf-8')

@pytest.fixture
def logger():
    log = logging.getLogger()
    log.addHandler(logging.StreamHandler())
    log.setLevel(logging.DEBUG)
    return log

def test_print_response(logger):
    session = requests.Session()
    response = session.get('http://127.0.0.1:5000/')
    assert response.status_code == 300, logger.warning(print_raw_http(response))

Contents of hello.py:

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

Run:

 $ FLASK_APP=hello.py python -m flask run
 $ python -m pytest test_print.py

Standard output:

------------------------------ Captured log call ------------------------------
DEBUG    urllib3.connectionpool:connectionpool.py:225 Starting new HTTP connection (1): 127.0.0.1:5000
DEBUG    urllib3.connectionpool:connectionpool.py:437 http://127.0.0.1:5000 "GET / HTTP/1.1" 200 13
WARNING  root:test_print_raw_response.py:25 

GET / HTTP/1.1
Host: 127.0.0.1:5000
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive


HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 13
Server: Werkzeug/1.0.1 Python/3.6.8
Date: Thu, 24 Sep 2020 21:00:54 GMT

Hello, World!
