
How to get browser network logs using python selenium

I'm trying to get browser network logs using selenium to debug requests and responses. Could you please help me find a way to do this?

I'm using selenium 3.14.0 and the latest Chrome browser.

Using python + selenium + firefox

Don't set up a proxy unless you have to. In order to get outbound API requests I used the solution from this answer, but in python: https://stackoverflow.com/a/45859018/14244758

test = driver.execute_script("var performance = window.performance || window.mozPerformance || window.msPerformance || window.webkitPerformance || {}; var network = performance.getEntries() || {}; return network;")

for item in test:
  print(item)

You get an array of dicts.

This allows me to see all the network requests made. I'm using it to parse out a parameter from one of the requests so that I can use it to make my own requests against the API.
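As a rough illustration of that last step, here is a minimal sketch of pulling one query parameter out of the returned entries. The helper name `find_query_param`, the sample URLs, and the `session` parameter are all made up for the example. It only assumes that each dict has a `name` key holding the resource URL, which is how `PerformanceEntry` objects are usually serialized.

```python
from urllib.parse import urlparse, parse_qs


def find_query_param(entries, url_fragment, param):
    """Return the first value of `param` from an entry whose URL contains `url_fragment`."""
    for entry in entries:
        url = entry.get("name", "")  # PerformanceEntry.name holds the resource URL
        if url_fragment not in url:
            continue
        values = parse_qs(urlparse(url).query).get(param)
        if values:
            return values[0]
    return None


# Hypothetical sample data, shaped like the dicts returned by execute_script above.
sample_entries = [
    {"name": "https://example.com/", "entryType": "navigation"},
    {"name": "https://example.com/api/items?session=abc123&page=2", "entryType": "resource"},
]

print(find_query_param(sample_entries, "/api/items", "session"))
```

In a real session you would pass the `test` list from the snippet above instead of `sample_entries`.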

Using python + selenium + Chrome

EDIT: this answer got a lot of attention, so here is how I'm doing it now with Chrome (taken from the undetected-chromedriver code):

import json

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
# Ask ChromeDriver to record performance (network) and browser console logs
chrome_options.set_capability(
    "goog:loggingPrefs", {"performance": "ALL", "browser": "ALL"}
)
driver = webdriver.Chrome(options=chrome_options)

# Visit your website, log in, etc., then:
log_entries = driver.get_log("performance")

bearer_token = None
for entry in log_entries:
    # Each entry wraps a DevTools event as a JSON string under "message"
    obj = json.loads(entry["message"])
    message = obj.get("message", {})
    method = message.get("method")
    # Note: a tuple of both names; the original `'a' or 'b'` only ever tested the first one
    if method in ("Network.requestWillBeSentExtraInfo", "Network.requestWillBeSent"):
        # Events without associatedCookies are skipped by the empty default
        for c in message.get("params", {}).get("associatedCookies", []):
            if c["cookie"]["name"] == "authToken":
                bearer_token = c["cookie"]["value"]
    print(type(message), method)
    print("--------------------------------------")

With this method you can get the request cookies and JSON payload.

Using Python and ChromeDriver

To get network logs, you need to install BrowserMobProxy along with selenium in python:

pip install browsermob-proxy

Then download the browsermob-proxy zip from https://bmp.lightbody.net/.

Unzip it to any folder (e.g. path/to/extracted_folder). This folder contains the browsermob-proxy binary file. We need to pass this path when calling Server() in the python code.

You need to start the browser proxy and configure the proxy in the Chrome options of the Chrome driver:

from browsermobproxy import Server
from selenium import webdriver

server = Server("path/to/extracted_folder/bin/browsermob-proxy")
server.start()
proxy = server.create_proxy()

# Configure the browser proxy in chrome options
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--proxy-server={0}".format(proxy.proxy))
browser = webdriver.Chrome(options=chrome_options)

#tag the har(network logs) with a name
proxy.new_har("google")

Then you can navigate to a page using selenium:

browser.get("http://www.google.co.in")

After navigation, you can get the network logs in JSON format from the proxy:

print(proxy.har)  # returns the network logs (HAR) as JSON
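Printing the whole HAR can be noisy. The following is a minimal sketch of reducing it to one line per request. The helper name `summarize_har` and the sample data are made up for the example, and it assumes the HAR follows the usual layout of `log` -> `entries`, each with a `request` and a `response` object.

```python
def summarize_har(har):
    """Return (method, status, url) tuples for every entry in a HAR dict."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        response = entry["response"]
        rows.append((request["method"], response["status"], request["url"]))
    return rows


# Hypothetical sample data standing in for proxy.har.
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "http://www.google.co.in/"},
             "response": {"status": 200}},
            {"request": {"method": "GET", "url": "http://www.google.co.in/missing.png"},
             "response": {"status": 404}},
        ]
    }
}

for method, status, url in summarize_har(sample_har):
    print(method, status, url)
```

In the script above you would call `summarize_har(proxy.har)` after `browser.get(...)`.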

Also, before quitting the driver, stop the proxy server at the end:

server.stop()
browser.quit()

Try selenium-wire. I think this is a better way, and it also integrates with undetected-chromedriver to help against bot detection.
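A minimal sketch of what that could look like. selenium-wire exposes captured traffic as `driver.requests`, where each request has `url`, `method` and a `response` that may be `None`. The helper name `summarize_requests` is made up here, and the block runs against stand-in objects so that it does not need a browser.

```python
from types import SimpleNamespace


def summarize_requests(requests):
    """Return (method, status, url) for request objects shaped like selenium-wire's."""
    rows = []
    for request in requests:
        response = request.response  # None when no response was captured
        status = response.status_code if response is not None else None
        rows.append((request.method, status, request.url))
    return rows


# With selenium-wire installed (pip install selenium-wire), real usage would be roughly:
#   from seleniumwire import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://example.com")
#   rows = summarize_requests(driver.requests)

# Hypothetical stand-ins for captured requests, used only to exercise the helper.
fake_requests = [
    SimpleNamespace(method="GET", url="https://example.com/",
                    response=SimpleNamespace(status_code=200)),
    SimpleNamespace(method="POST", url="https://example.com/api", response=None),
]

for row in summarize_requests(fake_requests):
    print(row)
```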

For the latest python selenium version, 4.1.0, the documentation of webdriver.get_log(self, log_type) lists only 4 log types:

driver.get_log('browser')
driver.get_log('driver')
driver.get_log('client')
driver.get_log('server')

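The performance log is not in that list, so it may not be retrievable through driver.get_log (with Chrome, it is only available when performance logging is enabled through the goog:loggingPrefs capability).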
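One way to check what a given session actually offers is to look at `driver.log_types` before calling `get_log`. The sketch below is only an illustration: the helper name `get_log_if_available` and the `FakeDriver` class are invented. It assumes the driver implements the log-types endpoint behind `log_types`; Selenium's Python bindings expose the property, but not every browser driver supports it, so the call may raise.

```python
def get_log_if_available(driver, log_type):
    """Return driver.get_log(log_type), or None when the session does not offer that type."""
    if log_type not in driver.log_types:
        return None
    return driver.get_log(log_type)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, used only to exercise the helper."""

    log_types = ["browser", "driver"]

    def get_log(self, log_type):
        return [{"level": "INFO", "message": "sample " + log_type + " entry"}]


driver = FakeDriver()
print(get_log_if_available(driver, "browser"))
print(get_log_if_available(driver, "performance"))
```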

To get only the network logs up until the page has finished loading (no ajax/async network logs during the main usage of the page), you can get the Performance Log: http://chromedriver.chromium.org/logging/performance-log

To enable performance logging for ChromeDriver, for example:

DesiredCapabilities cap = DesiredCapabilities.chrome();
LoggingPreferences logPrefs = new LoggingPreferences();
logPrefs.enable(LogType.PERFORMANCE, Level.ALL);
cap.setCapability(CapabilityType.LOGGING_PREFS, logPrefs);
RemoteWebDriver driver = new RemoteWebDriver(new URL("http://127.0.0.1:9515"), cap);

The chromium performance-log page also links to this complete example, https://gist.github.com/klepikov/5457750, which has Java and python code to get the Performance Logs.

Again, it's important to keep in mind that this will only get the network requests up until the point that the page is finished loading. After that, the driver will only return the same performance logs until the page reloads.


If you want to get network logs asynchronously throughout the usage of the page, you can use BrowserMobProxy to act as a proxy server for your Selenium driver and capture all those network requests. Then, you can get those captured requests from BrowserMobProxy's generated HAR file: https://github.com/lightbody/browsermob-proxy#using-with-selenium

// start the proxy
BrowserMobProxy proxy = new BrowserMobProxyServer();
proxy.start(0);

// get the Selenium proxy object
Proxy seleniumProxy = ClientUtil.createSeleniumProxy(proxy);

// configure it as a desired capability
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(CapabilityType.PROXY, seleniumProxy);

// start the browser up
WebDriver driver = new FirefoxDriver(capabilities);

// enable more detailed HAR capture, if desired (see CaptureType for the complete list)
proxy.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT, CaptureType.RESPONSE_CONTENT);

// create a new HAR with the label "yahoo.com"
proxy.newHar("yahoo.com");

// open yahoo.com
driver.get("http://yahoo.com");

// get the HAR data
Har har = proxy.getHar();

Once you have the HAR file, it is a JSON-like list of network events that you can work with.
