C# Selenium Inject/execute JS on page load

I'm using .NET 6 with C# 10.

What I'm trying to achieve is to run Selenium in headless mode whilst remaining "undetectable". I followed the instructions from here: https://intoli.com/blog/not-possible-to-block-chrome-headless/ which provided a page to test your bot: https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html

Headless mode causes some JS vars (like window.chrome) to be unset or invalid, which causes the bot to be detected.

IJavaScriptExecutor doesn't work since it runs after the page has loaded. The same author mentions that you have to capture the response and inject JS in this article: https://intoli.com/blog/making-chrome-headless-undetectable/ (Putting It All Together section)

Since the article uses Python, I followed this: https://www.automatetheplanet.com/webdriver-capture-modify-http-traffic/ and this: "Titanium Web Proxy - Can't modify request body", which uses the Titanium Web Proxy library (found here: https://github.com/justcoding121/titanium-web-proxy)

For testing, I used this site http://www.example.com and tried to modify the response (change something in the HTML, set JS vars, etc.)

Here is the proxy class:

using System;
using System.Net;
using System.Threading.Tasks;
using Titanium.Web.Proxy;
using Titanium.Web.Proxy.EventArguments;
using Titanium.Web.Proxy.Models;

public static class Proxy
{
    static ProxyServer proxyServer = new ProxyServer(userTrustRootCertificate: true);

    public static void StartProxy()
    {
        // Run on port 8080, decrypt SSL
        ExplicitProxyEndPoint explicitEndPoint = new ExplicitProxyEndPoint(IPAddress.Any, 8080, true);

        proxyServer.Start();
        proxyServer.AddEndPoint(explicitEndPoint);

        proxyServer.BeforeResponse += OnBeforeResponse;
    }

    static async Task OnBeforeResponse(object sender, SessionEventArgs ev)
    {
        var request = ev.HttpClient.Request;

        // Modify the title tag in responses from example.com
        if (String.Equals(request.RequestUri.Host, "www.example.com", StringComparison.OrdinalIgnoreCase))
        {
            var body = await ev.GetResponseBodyAsString();

            body = body.Replace("<title>Example Domain</title>", "<title>Completely New Title</title>");

            ev.SetResponseBodyString(body);
        }
    }

    public static void StopProxy()
    {
        proxyServer.Stop();
    }
}

And here is the Selenium code:

Proxy.StartProxy();

string url = "localhost:8080";
var seleniumProxy = new OpenQA.Selenium.Proxy 
{
    HttpProxy = url,
    SslProxy = url,
    FtpProxy = url
};

ChromeOptions options = new ChromeOptions();
options.AddArgument("ignore-certificate-errors");
options.Proxy = seleniumProxy;

IWebDriver driver = new ChromeDriver(@"C:\ChromeDrivers\103\", options);
driver.Manage().Window.Maximize();
driver.Navigate().GoToUrl("http://www.example.com");

Console.ReadLine();
TornCityBot.Proxy.StopProxy();

When Selenium loads http://www.example.com, the <title>Example Domain</title> should be changed to <title>Completely New Title</title>, but there was no change. I tried setting the proxy URL to http://localhost:8080, 127.0.0.1:8080, localhost:8080, etc., but nothing changed.

As a test, I ran the code and left the proxy on. I then ran curl --proxy http://localhost:8080 http://www.example.com in Git Bash and the output was:

<!doctype html>
<html>
<head>
    <title>Completely New Title</title>
. . .

The proxy was working: it modified the response for the curl command. But for some reason, it wasn't working with Selenium.

If you guys have a solution that can also work on HTTPS or a better method to execute JavaScript on page load, that would be great. If it's not possible, then I might need to forget about headless.

Thanks in advance for any help.

Selenium.WebDriver 4.3.0 and ChromeDriver 103

Try using the ExecuteCdpCommand method:

var options = new ChromeOptions();
options.AddArgument("--headless");
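// No inner quotes: the argument is not parsed by a shell, so quotes would become part of the user agent string.
options.AddArgument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36");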

using var driver = new ChromeDriver(options);

Dictionary<string, object> cmdParams= new();
cmdParams.Add("source", "Object.defineProperty(navigator, 'webdriver', { get: () => false });");
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);

With this piece of code we pass the first two tests, and if you follow the guide you've already mentioned, I think it's easy to bypass the rest.
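To check that the injected script really runs before the page's own code, one option is to read the value back after navigation. The following is only a minimal sketch, assuming Selenium.WebDriver 4.x with a matching ChromeDriver on the PATH; the target URL is the example.com test site from the question, and the variable name `webdriverFlag` is purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

var options = new ChromeOptions();
options.AddArgument("--headless");

using var driver = new ChromeDriver(options);

// Register the override before navigating so it is evaluated in every new document.
var cmdParams = new Dictionary<string, object>
{
    ["source"] = "Object.defineProperty(navigator, 'webdriver', { get: () => false });"
};
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);

driver.Navigate().GoToUrl("http://www.example.com");

// Read the property back from the loaded page to see what page scripts would observe.
var webdriverFlag = ((IJavaScriptExecutor)driver).ExecuteScript("return navigator.webdriver;");
Console.WriteLine($"navigator.webdriver = {webdriverFlag}");
```

The same read-back can be repeated for any other property you override, which makes it easier to tell whether a failed test is caused by the script not running or by the override itself.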


UPDATE

var initialScript = @"Object.defineProperty(Notification, 'permission', {
                        get: function () { return ''; }
                        });
                        window.chrome = true;
                        Object.defineProperty(navigator, 'webdriver', {
                        get: () => false});
                        Object.defineProperty(window, 'chrome', {
                        get: () => true});
                        Object.defineProperty(navigator, 'plugins', {
                        writable: true,
                        configurable: true,
                        enumerable: true,
                        value: 'works'});
                        navigator.plugins.length = 1;
                        Object.defineProperty(navigator, 'language', {
                        get: () => 'el-GR'});
                        Object.defineProperty(navigator, 'deviceMemory', {
                        get: () => 8});
                        Object.defineProperty(navigator, 'hardwareConcurrency', {
                        get: () => 8});";

// Use the indexer rather than Add: Add throws if the "source" key is already present.
cmdParams["source"] = initialScript;
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
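If you end up registering several scripts, a small helper that builds a fresh parameter dictionary per call avoids reusing one shared dictionary between calls. This is just a sketch under the same assumptions as above (Selenium.WebDriver 4.x, ChromeDriver on the PATH); `AddInitScript` is a hypothetical name, not part of Selenium, and the overrides are taken from the script above.

```csharp
using System.Collections.Generic;
using OpenQA.Selenium.Chrome;

var options = new ChromeOptions();
options.AddArgument("--headless");

using var driver = new ChromeDriver(options);

// Each call registers one script that runs at the start of every new document.
AddInitScript(driver, "Object.defineProperty(navigator, 'webdriver', { get: () => false });");
AddInitScript(driver, "Object.defineProperty(navigator, 'hardwareConcurrency', { get: () => 8 });");

driver.Navigate().GoToUrl("http://www.example.com");

// Hypothetical helper: wraps the CDP call with a new dictionary each time,
// so no key is ever added twice.
static void AddInitScript(ChromeDriver driver, string source)
{
    driver.ExecuteCdpCommand(
        "Page.addScriptToEvaluateOnNewDocument",
        new Dictionary<string, object> { ["source"] = source });
}
```

Keeping each override in its own script also means a syntax error in one of them should not stop the others from being evaluated.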

