How to get the raw JSON response of an HTTP request from `driver.page_source` in Selenium webdriver Firefox
If I browse to https://httpbin.org/headers
I expect to get the following JSON response:
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.5",
    "Connection": "close",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
  }
}
However, if I use Selenium:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
url = 'https://httpbin.org/headers'
driver.get(url)
print(driver.page_source)
driver.close()
I get:
<html platform="linux" class="theme-light" dir="ltr"><head><meta http-equiv="Content-Security-Policy" content="default-src 'none' ; script-src resource:; "><link rel="stylesheet" type="text/css" href="resource://devtools-client-jsonview/css/main.css"><script type="text/javascript" charset="utf-8" async="" data-requirecontext="_" data-requiremodule="viewer-config" src="resource://devtools-client-jsonview/viewer-config.js"></script><script type="text/javascript" charset="utf-8" async="" data-requirecontext="_" data-requiremodule="json-viewer" src="resource://devtools-client-jsonview/json-viewer.js"></script></head><body><div id="content"><div id="json">{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.5",
"Connection": "close",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
}
}
</div></div><script src="resource://devtools-client-jsonview/lib/require.js" data-main="resource://devtools-client-jsonview/viewer-config.js"></script></body></html>
Where do the HTML tags come from? How can I get the raw JSON response of the HTTP request from driver.page_source?
Prefix your URL with "view-source:".
Simple mode:
Example:
url = 'view-source:https://httpbin.org/headers'
driver.get(url)
content = driver.page_source
print(content)
Output:
'<html><head><meta name="viewport" content="width=device-width"><title>https://httpbin.org/headers</title><link rel="stylesheet" type="text/css" href="resource://content-accessible/viewsource.css"></head><body id="viewsource" class="highlight" style="-moz-tab-size: 4"><pre>{\n "headers": {\n "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", \n "Accept-Encoding": "gzip, deflate, br", \n "Accept-Language": "en-US,en;q=0.5", \n "Host": "httpbin.org", \n "Upgrade-Insecure-Requests": "1", \n "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0"\n }\n}\n</pre></body></html>'
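Best mode (for JSON):
Example:
import json
from selenium.webdriver.common.by import By

url = 'view-source:https://httpbin.org/headers'
driver.get(url)
# The view-source page holds the raw response body in a <pre> element.
# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name, which was removed in Selenium 4.3.
content = driver.find_element(By.TAG_NAME, 'pre').text
parsed_json = json.loads(content)
print(parsed_json)
Output: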
{'headers': {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.5',
'Host': 'httpbin.org',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0'}}
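If you already have the view-source page_source string and would rather not make a second WebDriver call, the <pre> contents can also be pulled out with the standard library. This is only a minimal sketch: the names PreTextParser and json_from_view_source are hypothetical, and the sample string is a shortened stand-in for the real output shown above.

```python
import json
from html.parser import HTMLParser


class PreTextParser(HTMLParser):
    """Collects the text found inside <pre> elements (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities such as &amp; are decoded
        super().__init__()
        self._in_pre = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._in_pre = True

    def handle_endtag(self, tag):
        if tag == "pre":
            self._in_pre = False

    def handle_data(self, data):
        if self._in_pre:
            self.chunks.append(data)


def json_from_view_source(page_source):
    """Parse the JSON text held in the <pre> of a view-source page."""
    parser = PreTextParser()
    parser.feed(page_source)
    parser.close()
    return json.loads("".join(parser.chunks))


# Shortened stand-in for driver.page_source of a view-source: URL
sample = (
    '<html><head><title>https://httpbin.org/headers</title></head>'
    '<body id="viewsource"><pre>{\n "headers": {\n "Host": "httpbin.org"\n }\n}\n'
    '</pre></body></html>'
)
print(json_from_view_source(sample))
```

In a real session you would pass driver.page_source instead of sample.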
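Besides the raw JSON response, driver.page_source also contains the HTML that Firefox uses to "pretty print" the response in the browser. You get the same result if you inspect the source of the JSON response in the browser with Firefox's DOM and Style Inspector.
To get the raw JSON response, you can navigate the HTML elements as usual:
from selenium.webdriver.common.by import By
# find_element(By.XPATH, ...) replaces find_element_by_xpath, which was removed in Selenium 4.3
print(driver.find_element(By.XPATH, "//div[@id='json']").text)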
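The same element lookup can be done offline on a saved page_source string, which may help when the driver has already been closed. The sketch below selects an element by its id and tracks nesting depth so that closing tags of inner elements do not end the capture early. The names ElementByIdText and json_from_element are hypothetical, and the sample markup is a trimmed version of the JSON viewer page from the question, assuming the JSON text sits directly inside the div with id "json".

```python
import json
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementByIdText(HTMLParser):
    """Collects the text inside the first element with a given id (hypothetical helper)."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # 0 means we are outside the target element
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def json_from_element(page_source, element_id="json"):
    """Parse the JSON text contained in the element with the given id."""
    parser = ElementByIdText(element_id)
    parser.feed(page_source)
    parser.close()
    return json.loads("".join(parser.chunks))


# Trimmed stand-in for the JSON viewer markup shown in the question
sample = (
    '<html><body><div id="content"><div id="json">'
    '{"headers": {"Host": "httpbin.org"}}'
    '</div></div><script src="viewer.js"></script></body></html>'
)
print(json_from_element(sample))
```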
This article helped me solve the problem for Firefox: https://blog.francium.tech/firefox-selenium-disable-json-formatting-cfaf466fd20f
I added this preference to my driver factory:
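from selenium.webdriver.firefox.options import Options as FirefoxOptions

# Method of the driver factory class
@staticmethod
def get_firefox_options(headless):
    options = FirefoxOptions()
    # Turn off Firefox's built-in JSON viewer so the response is not wrapped in viewer HTML
    options.set_preference('devtools.jsonview.enabled', False)
    if headless:
        # Equivalent to options.headless = True, which newer Selenium 4 releases no longer provide
        options.add_argument('-headless')
    return options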