
Running a .py file with selenium in docker

I have a Python script that scrapes some web information using Selenium. I've built a Docker image of my project:

FROM python:3.7-slim

WORKDIR /

COPY requirements.txt ./

RUN pip install --upgrade pip && pip install -r requirements.txt

COPY . .

RUN pip install -e .

CMD ["python", "src/project/scraper.py"]

I get the following error when I run it:

selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home

The chromedriver.exe file is located in a data folder, and the .py script refers to the right place (it does run locally).

Does anyone know how I can run Chrome in this container? The folder structure is as follows:

|-- data
|  |--chromedriver.exe
|  |--file.csv
|-- src
|  |--project
|     |--scraper.py
|-- Dockerfile
|-- requirements.txt

Thanks!

I assume you are using Linux containers in Docker, since python:3.7-slim is a Linux image. You cannot execute Windows binaries (.exe files) on Linux, so you need to install the Linux build of chromedriver inside the image. See: How to Setup Selenium with ChromeDriver on Ubuntu 18.04 & 16.04
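If the same script has to run both locally on Windows and inside the Linux container, one option is to pick the driver path per platform. The following is only a sketch: `resolve_chromedriver` and its `project_root` parameter are hypothetical names, and it assumes the Windows binary stays in data/chromedriver.exe as in the question, while the Linux build is installed somewhere on the PATH.

```python
import platform
import shutil
from pathlib import Path
from typing import Optional


def resolve_chromedriver(project_root: Path, system: Optional[str] = None) -> Optional[str]:
    """Return a chromedriver path suitable for the given OS, or None if not found.

    Hypothetical helper: `system` defaults to the OS the script is running on.
    """
    system = system or platform.system()
    if system == "Windows":
        # Local development on Windows: use the binary bundled in data/.
        candidate = project_root / "data" / "chromedriver.exe"
        return str(candidate) if candidate.exists() else None
    # Linux container: rely on a chromedriver that was installed on the PATH.
    return shutil.which("chromedriver")
```

A `None` result means no usable driver was found for that platform, which is the situation behind the error in the question.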

Your Dockerfile should look something like this:

FROM python:3.7-slim

# install Chrome and chromedriver
# curl, wget and gnupg are not part of the slim image, so install them first.
# Java (default-jdk) is not needed for the Python Selenium bindings.
# The chromedriver version in the URL must match the installed Chrome version.
RUN apt-get update && \
    apt-get install -y curl wget gnupg unzip xvfb libxi6 libgconf-2-4 && \
    curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
    apt-get -y update && \
    apt-get install -y google-chrome-stable && \
    wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip && \
    unzip chromedriver_linux64.zip && \
    mv chromedriver /usr/bin/chromedriver && \
    chown root:root /usr/bin/chromedriver && \
    chmod +x /usr/bin/chromedriver

WORKDIR /

COPY requirements.txt ./

RUN pip install --upgrade pip && pip install -r requirements.txt

COPY . .

RUN pip install -e .

CMD ["python", "src/project/scraper.py"]
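With chromedriver on the PATH, the script no longer needs to point at data/chromedriver.exe. Inside a container Chrome usually also has to run headless, and because the process runs as root it typically needs the sandbox disabled. A minimal sketch of how scraper.py might create its driver is shown here; the function names `container_chrome_args` and `make_driver` are hypothetical, and the exact set of flags your image needs is an assumption to verify.

```python
from typing import List


def container_chrome_args() -> List[str]:
    """Chrome flags commonly needed when running inside a Docker container."""
    return [
        "--headless",               # no display is available in the container
        "--no-sandbox",             # Chrome's sandbox refuses to start as root
        "--disable-dev-shm-usage",  # Docker's default /dev/shm is small
    ]


def make_driver():
    """Create a Chrome driver, finding chromedriver via the PATH."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in container_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

The scraper would then call `driver = make_driver()`, use `driver.get(...)` as before, and finish with `driver.quit()`.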

Let me share what has worked for me in the past.

Try installing Chrome and chromedriver, and adding chromedriver to the PATH, from within the Dockerfile.

Note:

  • This uses Python 3.8. You can try changing it to 3.7 and see if it works for you.
  • It is NOT configured for multi-stage builds.
    • i.e. you may want to remove "FROM python:3.7-slim" from your own Dockerfile before appending your part at the end.

FROM python:3.8 AS builder

RUN apt-get update; apt-get clean

# Install chrome dependencies
RUN apt-get install -y x11vnc xvfb fluxbox wget wmctrl unzip

# Set up the Chrome apt repository
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list

# Update the package list and install chrome
RUN apt-get update -y
RUN apt-get install -y google-chrome-stable

# Set up Chromedriver Environment variables
ENV CHROMEDRIVER_VERSION 87.0.4280.88
ENV CHROMEDRIVER_DIR /chromedriver
RUN mkdir $CHROMEDRIVER_DIR

# Download and install Chromedriver
RUN wget -q --continue -P $CHROMEDRIVER_DIR "http://chromedriver.storage.googleapis.com/$CHROMEDRIVER_VERSION/chromedriver_linux64.zip"
RUN unzip $CHROMEDRIVER_DIR/chromedriver* -d $CHROMEDRIVER_DIR

# Put Chromedriver into the PATH
ENV PATH $CHROMEDRIVER_DIR:$PATH

RUN python -m venv /opt/venv
# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"

...
<YOUR_DOCKERFILE_PARTS>

As of today (Jan 25th, 2021), the latest stable version of Google Chrome is 88.0.4324.96 (released Jan 19th, 2021). So if the above doesn't work, try changing the chromedriver version so that it matches the installed Chrome browser.
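A quick way to spot such a mismatch is to compare the major version numbers that the two binaries report. The sketch below assumes both commands print a four-part version such as "Google Chrome 88.0.4324.96" and "ChromeDriver 87.0.4280.88 (...)" when called with --version; the helper names are hypothetical.

```python
import re
import subprocess
from typing import Optional


def major_version(version_output: str) -> Optional[int]:
    """Extract the major number from the first four-part version in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when both outputs contain a version with the same major number."""
    chrome = major_version(chrome_output)
    driver = major_version(driver_output)
    return chrome is not None and chrome == driver


def read_version(command: str) -> str:
    """Run `<command> --version` and return what it prints."""
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Inside the container, `versions_match(read_version("google-chrome"), read_version("chromedriver"))` returning False suggests that CHROMEDRIVER_VERSION should be changed.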
