How to run Selenium Webdriver correctly on Heroku with a Rails app

I'm implementing a very basic scraper in my app with the watir gem. It runs perfectly fine locally, but when I run it on Heroku it triggers this error: Webdrivers::BrowserNotFound: Failed to find Chrome binary.

I added the google-chrome and chromedriver buildpacks to my app to tell Selenium where to find Chrome on Heroku, but it still does not work. Moreover, when I print the options, the binary seems to be correctly set:

#<Selenium::WebDriver::Chrome::Options:0x0000558bdf7ecc30 @args=#<Set: {"--user-data-dir=/app/tmp/chrome", "--no-sandbox", "--window-size=1200x600", "--headless", "--disable-gpu"}>, @binary="/app/.apt/usr/bin/google-chrome-stable", @prefs={}, @extensions=[], @options={}, @emulation={}, @encoded_extensions=[]>

These are my app's buildpack URLs:

1. heroku/ruby
2. heroku/google-chrome
3. heroku/chromedriver

This is my code:

def new_browser(downloads: false)

  options = Selenium::WebDriver::Chrome::Options.new

  chrome_dir = File.join Dir.pwd, %w(tmp chrome)
  FileUtils.mkdir_p chrome_dir
  user_data_dir = "--user-data-dir=#{chrome_dir}"
  options.add_argument user_data_dir

  if chrome_bin = ENV["GOOGLE_CHROME_SHIM"]
    options.add_argument "--no-sandbox"
    options.binary = chrome_bin
  end

  options.add_argument "--window-size=1200x600"
  options.add_argument "--headless"
  options.add_argument "--disable-gpu"

  browser = Watir::Browser.new :chrome, options: options

  if downloads
    downloads_dir = File.join Dir.pwd, %w(tmp downloads)
    FileUtils.mkdir_p downloads_dir

    bridge = browser.driver.send :bridge
    path = "/session/#{bridge.session_id}/chromium/send_command"
    params = { behavior: "allow", downloadPath: downloads_dir }
    bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                  params: params)
  end
  browser
end

Any idea how to fix this? I checked many similar issues on different websites but did not find anything.

I also worked on the same thing for the last two days and, as you said, tried a lot of different things. I finally made it work.

The problem is that Heroku installs Chrome at a different path than the one the gem expects. In the source code of the webdrivers gem I found that it looks for Chrome either in the default system locations (Linux, macOS, Windows), which is why it works locally, or at the path defined in the WD_CHROME_PATH environment variable. To set the path on Heroku we must set this env variable:

"WD_CHROME_PATH": "/app/.apt/usr/bin/google-chrome"

It must be google-chrome, not google-chrome-stable as shown in many examples.

That is, just run this from a terminal:

heroku config:set WD_CHROME_PATH=/app/.apt/usr/bin/google-chrome
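To confirm that the configured path really points at an executable before a browser is launched, a small check along these lines may help. This is a minimal sketch in plain Ruby: the helper name `resolve_chrome_binary` and its `fallbacks:` parameter are hypothetical and not part of the webdrivers gem; only the environment variable names and the Heroku path come from this thread.

```ruby
# Hypothetical helper (not part of webdrivers): return the first candidate
# path that exists and is executable, or nil if none is usable.
def resolve_chrome_binary(env = ENV, fallbacks: ["/app/.apt/usr/bin/google-chrome"])
  candidates = [
    env["WD_CHROME_PATH"],     # variable read by the webdrivers gem
    env["GOOGLE_CHROME_SHIM"], # variable used in the question's code
    *fallbacks                 # assumed default: the Heroku path from this answer
  ]
  candidates.compact.find { |path| File.file?(path) && File.executable?(path) }
end

chrome = resolve_chrome_binary
puts(chrome ? "Chrome binary: #{chrome}" : "No usable Chrome binary found")
```

Running this in a one-off dyno should show quickly whether the variable is set to a path that exists there.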

No solutions worked for me (Heroku-18 stack, with the 'https://github.com/heroku/heroku-buildpack-google-chrome.git' and 'https://github.com/heroku/heroku-buildpack-chromedriver' buildpacks).

I tried all kinds of solutions and finally found a reliable way to debug it yourself.

It draws on a couple of resources, in particular https://www.simon-neutert.de/2018/watir-chrome-heroku/ and https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor.

Check where your actual binary and driver are on Heroku:

$ heroku run bash
~ $ which chromedriver
/app/.chromedriver/bin/chromedriver
~ $ which google-chrome
/app/.apt/usr/bin/google-chrome
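The same lookup can be done from Ruby, for instance inside a console session on the dyno. The following is only a sketch: the `which` helper is hypothetical and simply walks the directories listed in PATH, which is roughly what the shell command does.

```ruby
# Hypothetical helper: look up an executable on PATH, similar to the
# shell's `which`. Returns the full path, or nil if nothing is found.
def which(command, path_var = ENV.fetch("PATH", ""))
  path_var.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Report where the two binaries discussed in this answer live, if anywhere.
%w[google-chrome chromedriver].each do |name|
  puts "#{name}: #{which(name) || 'not found on PATH'}"
end
```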

The shims that the buildpacks set up for me didn't work. In fact, even if you set the shim variables on Heroku to something different, the buildpacks reset them, so you lose the new shim (see here: https://github.com/heroku/heroku-buildpack-google-chrome/blob/master/bin/compile). So I made new shims:

$ heroku config:set GOOGLE_CHROME_REAL=/app/.apt/usr/bin/google-chrome
$ heroku config:set CHROME_DRIVER_REAL=/app/.chromedriver/bin/chromedriver

Then I modified the browser initializer (from https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor):

def new_browser(downloads: false)
    require 'watir'
    require 'webdrivers'
    options = Selenium::WebDriver::Chrome::Options.new

    # make a directory for chrome if it doesn't already exist
    chrome_dir = File.join Dir.pwd, %w(tmp chrome)
    FileUtils.mkdir_p chrome_dir
    user_data_dir = "--user-data-dir=#{chrome_dir}"
    # add the option for user-data-dir
    options.add_argument user_data_dir

    # let Selenium know where to look for chrome if we have a hint from
    # heroku. chromedriver-helper & chrome seem to work out of the box on osx,
    # but not on heroku.
    if chrome_bin = ENV["GOOGLE_CHROME_REAL"]
        Selenium::WebDriver::Chrome.path = chrome_bin
    end
    if chrome_driver = ENV["CHROME_DRIVER_REAL"]
        Selenium::WebDriver::Chrome.driver_path = chrome_driver
    end

    # headless!
    options.add_argument "--window-size=1200x600"
    options.add_argument "--headless"
    options.add_argument "--disable-gpu"

    # make the browser
    browser = Watir::Browser.new :chrome, options: options

    # setup downloading options
    if downloads
      # make download storage directory
      downloads_dir = File.join Dir.pwd, %w(tmp downloads)
      FileUtils.mkdir_p downloads_dir

      # tell the bridge to use downloads
      bridge = browser.driver.send :bridge
      path = "/session/#{bridge.session_id}/chromium/send_command"
      params = { behavior: "allow", downloadPath: downloads_dir }
      bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                    params: params)
    end
    browser
end

Hope this helps others.

I tried to solve this for a while with different approaches, but none of them worked. Then I checked the webdrivers source code and found that you need to set the "WD_CHROME_PATH" env variable for it to work. I'm attaching my full setup here. This cost me a few hours to debug and fix.

spec_helper.rb

require 'webdrivers'
require 'capybara/rspec'

# The Heroku buildpacks put the Chrome binary in a non-standard location,
# which is exposed through GOOGLE_CHROME_SHIM
chrome_bin = ENV.fetch('GOOGLE_CHROME_SHIM', nil)

options = {}
options[:args] = ['headless', 'disable-gpu', 'window-size=1280,1024']
options[:binary] = chrome_bin if chrome_bin

Capybara.register_driver :headless_chrome do |app|
  Capybara::Selenium::Driver.new(app,
    browser: :chrome,
    # splat the hash so it is passed as keyword arguments on Ruby 3 as well
    options: Selenium::WebDriver::Chrome::Options.new(**options)
  )
end

Capybara.javascript_driver = :headless_chrome
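The option-building part of this spec_helper can also be pulled into a plain function, which makes it possible to check the result without starting a browser. This is a sketch only: `chrome_option_hash` is a hypothetical name, and the argument list is copied from the setup in this answer.

```ruby
# Hypothetical helper: build the options hash used in this answer from the
# environment. Accepts anything that responds to [] (ENV or a plain Hash).
def chrome_option_hash(env = ENV)
  options = { args: ["headless", "disable-gpu", "window-size=1280,1024"] }
  chrome_bin = env["GOOGLE_CHROME_SHIM"]
  # Only set :binary when the variable is present and non-empty
  options[:binary] = chrome_bin if chrome_bin && !chrome_bin.empty?
  options
end

p chrome_option_hash
```

The resulting hash would then be splatted into the constructor, for example Selenium::WebDriver::Chrome::Options.new(**chrome_option_hash).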

Gemfile

group :test do
  gem 'capybara'
  gem 'timecop'
  gem 'selenium-webdriver'
  gem 'webdrivers'
end

app.json

{
  "name": "evocal",
  "repository": "https://github.com/zeitdev/evocal",
  "environments": {
    "test": {
      "addons":[
        "heroku-postgresql:in-dyno"
      ],
      "scripts": {
        "test-setup": "bundle exec rake db:seed",
        "test": "bundle exec rspec"
      },
      "buildpacks": [
        { "url": "heroku/ruby" },
        { "url": "https://github.com/heroku/heroku-buildpack-google-chrome" },
        { "url": "https://github.com/heroku/heroku-buildpack-chromedriver" },
        { "url": "heroku/nodejs" }
      ],
      "env": {
        "WD_CHROME_PATH": "/app/.apt/opt/google/chrome/chrome"
      }
    }
  }
}

I don't yet fully understand how Selenium, webdriver and the gem interact with each other. Some also wrote that you can leave out one of the buildpacks. But this works, at least for now :-D.
