Unable to fetch data from a site using requests in Python

Question

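I'm trying to fetch text from this site. It's just a simple plain-text page. When I run the code below, the only thing it prints is a newline. I should mention that the page content is dynamic, so it changes every few minutes. My requests module version is 2.27.1. I'm using Python 3.9 on Windows.

What could be the problem?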

import requests

url = 'https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36',
}

content = requests.get(url, headers=headers)
print(content.text)

Here is an example of what the site looks like.

Answer 1

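This particular server appears to gate its response on the Accept-Encoding header rather than on the User-Agent. You can get a normal response like this: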

import requests
url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"
headers = {
    "Accept-Encoding": "gzip, deflate, br",
}
content = requests.get(url, headers=headers)
print(content.text)

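Depending on how the server responds over time, you may need to install the brotli package so that requests can decompress content the server compresses with Brotli.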
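To make the Brotli point concrete, here is a minimal sketch of how one might detect a Brotli-compressed response that cannot be decoded locally. The helper names (`brotli_installed`, `needs_brotli`, `fetch_text`) are hypothetical and not part of requests. The sketch assumes the decoder is provided by either the `brotli` or the `brotlicffi` package, and that the server reports the compression in its Content-Encoding response header.

```python
import importlib.util


def brotli_installed():
    """Return True if a Brotli decoder package can be found (assumed names)."""
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("brotlicffi", "brotli")
    )


def needs_brotli(content_encoding, have_brotli):
    """Return True when the body is Brotli-compressed but no decoder is installed."""
    encodings = [e.strip().lower() for e in (content_encoding or "").split(",")]
    return "br" in encodings and not have_brotli


def fetch_text(url):
    """Fetch url with the Accept-Encoding value from the answer above."""
    import requests  # imported here so the helpers above work without it

    headers = {"Accept-Encoding": "gzip, deflate, br"}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    if needs_brotli(response.headers.get("Content-Encoding"), brotli_installed()):
        raise RuntimeError("Response is Brotli-compressed; try: pip install brotli")
    return response.text
```

Calling `fetch_text(url)` with the URL from the question should then either return the page text or raise an error that points at the missing package, instead of printing unreadable output.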

Answer 2

You just need to add a user agent, like below.

import requests

url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"

headers = {
    'User-Agent': 'PostmanRuntime/7.29.0',
    'Accept': '*/*',
    'Cache-Control': 'no-cache',
    'Host': 'www.spaceweatherlive.com',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive'
}
response = requests.get(url, headers=headers)
print(response.text)
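Since the two answers attribute the fix to different headers, one way to check which header actually matters is to send each one on its own and see which requests return non-empty text. This is only a sketch: `single_header_variants` and `find_sufficient_headers` are hypothetical helper names, and requests still adds its own default headers to every call, so each variant overrides or extends those defaults rather than replacing them.

```python
def single_header_variants(headers):
    """Yield (name, {name: value}) pairs, one header at a time."""
    for name, value in headers.items():
        yield name, {name: value}


def find_sufficient_headers(url, headers):
    """Return the header names that, sent alone, produce a non-empty body."""
    import requests  # imported here so the generator above works without it

    working = []
    for name, variant in single_header_variants(headers):
        response = requests.get(url, headers=variant, timeout=10)
        # Treat a successful status with non-whitespace text as a normal response.
        if response.ok and response.text.strip():
            working.append(name)
    return working
```

Passing the `url` and `headers` from the snippet above to `find_sufficient_headers` would list the headers that are individually enough for this server at the time of the request.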

2 solutions

Solution 1
1 2022-01-25 22:25:33

Solution 2
0 2022-01-25 21:05:33
