
Difference between Python "requests" and Linux "curl"

I have tried looking in several places, but I could not find a satisfactory answer to this:

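What is the difference between the Python "requests" module and the Linux "curl" command? Does "requests" use "curl" underneath, or is it an entirely different way of handling HTTP requests/responses?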

For most requests they behave the same way (as they should), but occasionally I find that the responses differ, and it is really hard to work out why.

For example, a HEAD request using curl:

curl --head https://historia.sherpadesk.com
HTTP/2 302 
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:30 GMT
access-control-expose-headers: Request-Context
cache-control: private
location: /login/?ref=portal
set-cookie: ASP.NET_SessionId=nghpw4qp5cw2ntwmwfuxw3oi; path=/; HttpOnly; SameSite=Lax
content-length: 135
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000

If I use -L to follow redirects:

curl --head https://historia.sherpadesk.com -L
HTTP/2 302 
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:37 GMT
access-control-expose-headers: Request-Context
cache-control: private
location: /login/?ref=portal
set-cookie: ASP.NET_SessionId=trzp0bql4nibswux5z5wfayy; path=/; HttpOnly; SameSite=Lax
content-length: 135
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000

HTTP/2 302 
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:38 GMT
access-control-expose-headers: Request-Context
location: https://app.sherpadesk.com/login/?ref=portal
content-length: 161
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000

HTTP/2 200 
content-type: text/html; charset=utf-8
date: Mon, 28 Feb 2022 20:31:39 GMT
access-control-expose-headers: Request-Context
cache-control: no-store, no-cache
expires: -1
pragma: no-cache
set-cookie: ASP.NET_SessionId=aqmnxu2s3qkri3sravsrs1cq; path=/; HttpOnly; SameSite=Lax
content-length: 8935
request-context: appId=cid-v1:d5f9900e-ecd4-442f-9e92-e11b4cdbc0c9
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000

This is the (debug) output when I use requests.head(url) from Python's requests module:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): historia.sherpadesk.com:443
send: b'HEAD / HTTP/1.1\r\nHost: historia.sherpadesk.com\r\nUser-Agent: python-requests/2.26.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 403 Forbidden: Access is denied.\r\n'
header: Content-Length: 58
header: Content-Type: text/html
header: Date: Mon, 28 Feb 2022 20:36:18 GMT
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1
header: X-Content-Type-Options: nosniff
header: Strict-Transport-Security: max-age=31536000
DEBUG:urllib3.connectionpool:https://historia.sherpadesk.com:443 "HEAD / HTTP/1.1" 403 0
INFO:root:URL: https://historia.sherpadesk.com/
INFO:root:<Response [403]>

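This only ever results in a 403 response code. The response is the same whether allow_redirects is True or False. I have also tried using a proxy with the Python code, because I thought the request might be getting blocked if this URL identifies python-requests as a bot, but that failed too. Besides, if that were the case, why does curl succeed?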
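For anyone trying to reproduce this, debug output of the kind shown above (the send:, reply: and header: lines plus the urllib3.connectionpool messages) can be produced with a setup along the following lines. This is a minimal sketch using only the standard library's http.client and logging modules; the helper name enable_http_debug is hypothetical, and it is an assumption that the original log was generated this way.

```python
import http.client
import logging


def enable_http_debug():
    """Turn on wire-level HTTP debugging for code built on http.client.

    Hypothetical helper: the original post does not show how its log
    was produced, so this is only one common way to get similar output.
    """
    # http.client prints the raw "send:", "reply:" and "header:" lines
    # to stdout when debuglevel is greater than zero.
    http.client.HTTPConnection.debuglevel = 1

    # Route log records (including urllib3's) to the console at DEBUG level.
    logging.basicConfig(level=logging.DEBUG)
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    urllib3_logger.propagate = True
    return urllib3_logger


logger = enable_http_debug()
```

With this in place, any later call such as requests.head(url) should print the request line and headers that actually go over the wire, which makes it easier to compare them with what curl -v reports.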

So, my main question is: what are the major differences between curl and requests that might cause different responses in certain cases? If possible, I would really like a thorough explanation that can help me debug and resolve these issues.

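The two are different libraries (requests is built on urllib3, as the urllib3.connectionpool lines in your debug log show, and does not call curl), but the problem here is related to the User-Agent.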
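To see which headers requests sends when none are specified, the defaults can be printed without making any network call. This is a small sketch that relies on requests.utils.default_headers(); the exact version in the User-Agent value depends on the installed requests release.

```python
import requests

# Headers that requests attaches to every request unless overridden.
defaults = requests.utils.default_headers()

for name, value in defaults.items():
    print(f"{name}: {value}")

# The User-Agent identifies the client as python-requests/<version>.
user_agent = defaults["User-Agent"]
```

These should correspond to the User-Agent, Accept-Encoding, Accept and Connection headers visible in the send: line of the debug output in the question.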

When I try curl with the python-requests User-Agent specified:

$ curl --head -A "python-requests/2.26.0" https://historia.sherpadesk.com/
HTTP/2 403 
content-type: text/html
date: Mon, 28 Feb 2022 22:30:02 GMT
content-length: 58
x-frame-options: SAMEORIGIN
x-xss-protection: 1
x-content-type-options: nosniff
strict-transport-security: max-age=31536000

With curl's default User-Agent:

$ curl --head https://historia.sherpadesk.com/
HTTP/2 302
...

Evidently, they have some kind of website security that blocks HTTP clients such as python-requests but, for some reason, not curl.
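If the User-Agent really is what the server filters on, the Python side of the experiment would be to override that header in requests. The sketch below is illustrative only: the helper names prepare_head and send_head are hypothetical, and the value "curl/7.81.0" is an assumed example of a curl-style User-Agent, since the actual string depends on the installed curl version. Only the offline preparation step runs at import time; nothing is sent until send_head is called.

```python
import requests

URL = "https://historia.sherpadesk.com/"
# Assumption: an example curl-style User-Agent string; the real value
# depends on the curl version installed on your machine.
CURL_LIKE_UA = "curl/7.81.0"


def prepare_head(url, user_agent):
    """Build (but do not send) a HEAD request with a custom User-Agent."""
    session = requests.Session()
    request = requests.Request("HEAD", url, headers={"User-Agent": user_agent})
    # prepare_request merges the session defaults with the per-request headers.
    return session.prepare_request(request)


def send_head(url, user_agent):
    """Send the HEAD request over the network and return the response."""
    return requests.head(
        url,
        headers={"User-Agent": user_agent},
        allow_redirects=True,  # requests.head() does not follow redirects by default
        timeout=10,
    )


# Inspect what would be sent, without touching the network.
prepared = prepare_head(URL, CURL_LIKE_UA)
print(prepared.method, prepared.url)
print(prepared.headers["User-Agent"])
```

Calling send_head(URL, CURL_LIKE_UA) would then contact the server; its history attribute lists any redirects that were followed. Whether the server answers with a redirect instead of 403 depends on its filtering rules, which may look at more than the User-Agent.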

