Using Selenium in python with headless Chrome from a Docker container

I'm building a Docker image on top of the official jupyter/scipy-notebook base image (docs here, Dockerfile here).

FROM jupyter/scipy-notebook

USER root

# bash instead of dash to use source
RUN ln -snf /bin/bash /bin/sh

RUN apt-get update
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install

USER jovyan

RUN pip install --upgrade pip \
 && pip install gspread \
 && pip install isort \
 && pip install jupyter_contrib_nbextensions \
 && pip install nbdime \
 && pip install pathlib \
 && pip install selenium \
 && nbdime extensions --enable

RUN jupyter contrib nbextension install --user

RUN jupyter nbextension enable autosavetime/main \
 && jupyter nbextension enable codefolding/edit \
 && jupyter nbextension enable code_prettify/isort \
 && jupyter nbextension enable scratchpad/main \
 && jupyter nbextension enable splitcell/splitcell \
 && jupyter nbextension enable table_beautifier/main \
 && jupyter nbextension enable code_prettify/2to3 \
 && jupyter nbextension enable init_cell/main \
 && jupyter nbextension enable nbextensions_configurator/tree_tab/main \
 && jupyter nbextension enable spellchecker/main \
 && jupyter nbextension enable toc2/main \
 && jupyter nbextension enable toggle_all_line_numbers/main \
 && jupyter nbextension enable varInspector/main

I am running this container with

docker run -v my_dir:/home/jovyan/work -p 8888:8888 -a stdin -a stdout -i -t my_image /bin/bash

The directory I'm mounting contains the chromedriver executable.

When I open my Jupyter notebook and run the following code

import datetime
import os

import pandas as pd
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

chrome_path = '/home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests/chromedriver'
chrome_options = Options()  
chrome_options.add_argument("--headless")  
# chrome_options.binary_location = '/Applications/Google Chrome   Canary.app/Contents/MacOS/Google Chrome Canary'  
driver = webdriver.Chrome(executable_path=chrome_path, chrome_options=chrome_options)

I get the error

OSError: [Errno 8] Exec format error: '/home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests/chromedriver'

These may be helpful in tracing the error:

  • (from Jupyter notebook) !pwd returns /home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests
  • (from Jupyter notebook) !ls returns chromedriver among other files
  • (from Jupyter notebook) !google-chrome --version returns Google Chrome 68.0.3440.75

I've googled the error but couldn't find an answer. Also, if there is a simpler/better way of achieving this (using Selenium with Chrome from a Docker container), I'd be happy to take another approach.
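An "Exec format error" generally means the kernel did not recognize the file as an executable for the current platform. One common cause is a chromedriver build for a different operating system, for example a macOS binary mounted from the host into a Linux container. The following is a minimal sketch for checking that, assuming the file's leading magic bytes are enough to tell the formats apart; `MAGIC_NUMBERS` and `detect_binary_format` are hypothetical names chosen for illustration.

```python
# Leading bytes ("magic numbers") of common executable formats.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (macOS)",
    b"\xca\xfe\xba\xbe": "Mach-O universal (macOS)",
    b"MZ": "PE (Windows)",
}


def detect_binary_format(path):
    """Return a label for the executable format of the file at `path`."""
    with open(path, "rb") as f:
        header = f.read(4)
    for magic, label in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return label
    return "unknown"
```

Calling `detect_binary_format` on the mounted chromedriver path from inside the container should report an ELF binary if the Linux build was downloaded; any other result suggests the wrong build is being used.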

This is likely doable with jupyter/scipy-notebook as the base image; however, I used debian:stable.

I create my project file using $ touch test.py and use Selenium to start a Chrome browser instance (and perform a test request to verify that Selenium is working properly):

from selenium import webdriver

# Flags that let Chrome start inside a container
options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument("--disable-setuid-sandbox")
options.add_argument("--disable-extensions")
# Recent Selenium releases expect `options=`; the older `chrome_options=` keyword is deprecated
driver = webdriver.Chrome(options=options)

driver.get("https://httpstat.us/200")

if "200 OK" in driver.page_source:
    print('Selenium successfully opened with Chrome (under the Xvfb display) and navigated to "https://httpstat.us/200", you\'re all set!')

# Shut down the browser and the chromedriver process
driver.quit()
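Since the question asks about headless Chrome, here is a sketch of a variant that adds Chrome's `--headless` flag, in which case Chrome should not need an X display at all. This is an alternative to the Xvfb approach used in the rest of this answer, not a part of it. `build_chrome_arguments` and `make_driver` are hypothetical helper names, and `make_driver` assumes selenium is installed and chromedriver is on the PATH.

```python
def build_chrome_arguments(headless=True):
    """Return Chrome command-line flags suited to running inside a container."""
    arguments = [
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-extensions",
    ]
    if headless:
        # In headless mode Chrome renders without an X display.
        arguments.append("--headless")
    return arguments


def make_driver(headless=True):
    """Create a Chrome WebDriver (assumes selenium and chromedriver are available)."""
    # Imported lazily so that build_chrome_arguments has no third-party dependency.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in build_chrome_arguments(headless):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Keeping the flag list in a separate function makes it easy to switch between the headless and Xvfb setups without touching the driver creation code.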

Then, I create the invocation shell script using $ touch run.sh. I want to use Xvfb to run an X server inside the container for the Chrome browser instance:

# Below is the reason for "-nolisten tcp" (this is not documented within Xvfb manpages)
# https://superuser.com/questions/855019/make-xvfb-listen-only-on-local-ip
Xvfb :99 -screen 0 640x480x8 -nolisten tcp &
python3 test.py
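Because Xvfb is started in the background and the Python script runs immediately afterwards, the script could in principle start before the X server is ready. A minimal sketch of a guard for that, assuming Xvfb exposes its Unix socket under /tmp/.X11-unix (the usual location); `x_socket_path` and `wait_for_display` are hypothetical helpers, not part of Selenium or Xvfb.

```python
import os
import time


def x_socket_path(display, socket_dir="/tmp/.X11-unix"):
    """Map a DISPLAY value such as ':99' or ':99.0' to its Unix socket path."""
    number = display.split(":", 1)[1].split(".", 1)[0]
    return os.path.join(socket_dir, "X" + number)


def wait_for_display(display, timeout=10.0, interval=0.1, socket_dir="/tmp/.X11-unix"):
    """Poll until the X server socket exists; return False if it never appears."""
    path = x_socket_path(display, socket_dir)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    # One last check in case the socket appeared during the final sleep.
    return os.path.exists(path)
```

Calling `wait_for_display(os.environ["DISPLAY"])` at the top of test.py, before creating the driver, would let the script fail early with a clear message instead of hanging.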

Now, I'll create the following Dockerfile.

First, I'm going to install chromium, Xvfb, and Python:

FROM debian:stable
LABEL maintainer="Sean Pianka"

RUN apt-get update -y && apt-get install -y wget curl unzip libgconf-2-4
RUN apt-get update -y && apt-get install -y chromium xvfb python3 python3-pip 
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
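The wget line above does two things at once: it reads the version string from LATEST_RELEASE and then splices it into the download URL. As a rough sketch of the same URL construction in Python (no network access is performed here), with `BASE_URL`, `latest_release_url`, and `chromedriver_zip_url` being hypothetical names:

```python
BASE_URL = "http://chromedriver.storage.googleapis.com"


def latest_release_url(base_url=BASE_URL):
    """URL whose body is the latest chromedriver version string."""
    return base_url + "/LATEST_RELEASE"


def chromedriver_zip_url(version, platform="linux64", base_url=BASE_URL):
    """Build the download URL for a given chromedriver version and platform."""
    # strip() removes the trailing newline a fetched version string may carry.
    return "{}/{}/chromedriver_{}.zip".format(base_url, version.strip(), platform)
```

The `platform` argument matters: the container needs the linux64 archive regardless of the operating system of the host machine.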

Then, I'll create my project's directory, install selenium (and/or my project's dependencies), and copy my project code into the image.

RUN mkdir -p /opt/app
WORKDIR /opt/app
RUN pip3 install selenium
## or install from requirements.txt: comment out the line above and uncomment the two below
#COPY requirements.txt .
#RUN pip3 install -r requirements.txt
COPY test.py .

Lastly, I'll set DISPLAY to the display number that Xvfb serves (:99, matching run.sh), copy the run.sh script into the image, and invoke the script using /bin/bash.

# Set display port and dbus env to avoid hanging
ENV DISPLAY=:99
ENV DBUS_SESSION_BUS_ADDRESS=/dev/null
# Bash script to invoke xvfb, any preliminary commands, then invoke project
COPY run.sh .
CMD /bin/bash run.sh

This completes the Dockerfile. If you build the image and run a container from it, you will see the following output from the test.py file:

Selenium successfully opened with Chrome (under the Xvfb display) and navigated to "https://httpstat.us/200", you're all set!

If you would like pre-written Dockerfiles for any combination of Python 2/Python 3 and Chrome/Firefox, see my repository on GitHub which contains these different Dockerfile versions.
