How to install phantomjs and selenium with docker-compose on a Linux server?

I am using selenium and phantomjs for my web scraper. Everything works in my Windows test app. When I add this code to my main app, which is deployed with docker-compose, I get this error: selenium.common.exceptions.WebDriverException: Message: 'phantomjs' executable needs to be in PATH.

How should I fix this? Currently my docker-compose.yml looks like this:

version: '3.1'

services:

  tgbot:
    container_name: bot
    build:
      context: .
    command: python app.py
    restart: always
    environment:
      WEBAPP_PORT: 3001
    env_file:
      - ".env"
    # bot start after load db
    ports:
      - 8443:3001
    networks:
      - botnet

  phantomjs:
    image: shufo/phantomjs
    command: --webdriver 8901

networks:
      botnet:
        driver: bridge

And my Python code:

from selenium import webdriver
driver = webdriver.PhantomJS()

Dockerfile:

FROM python:latest

RUN mkdir /src
WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src

PS: I am using phantomjs because the webpage I am scraping relies on JS. It doesn't work with Chrome.

There are a few problems with your configuration:

  1. Your bot code runs in a different container from the one that launches phantomjs. This is why it cannot find the executable.
  2. The phantomjs container is not on the same network as your code.
  3. There are unused settings that seem to be copy-pasted from some other example.
  4. You force your containers to restart. With restart: always, a container is restarted even after it exits successfully.
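
To check the second point from inside the bot container, a small diagnostic like the following can help. It is only a sketch: the helper name can_resolve is made up for illustration, and it assumes the phantomjs service name from the compose file. On a shared Compose network, a service name resolves as a hostname from the other containers on that network.

```python
import socket


def can_resolve(hostname):
    """Return True if the hostname resolves to an address, False otherwise."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        return False


if __name__ == "__main__":
    # "phantomjs" is the Compose service name; it should only resolve
    # when this code runs in a container on the same network.
    print("phantomjs resolvable:", can_resolve("phantomjs"))
```

If this prints False from the bot container, the two services are most likely not attached to the same network.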

So here is a complete example of how to run everything:

  1. Create an empty folder myfolder and put app.py there with the following content:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
driver = webdriver.Remote(command_executor='http://phantomjs:8901/wd/hub/',desired_capabilities=DesiredCapabilities.PHANTOMJS)
  2. Put a requirements.txt file in myfolder (it should list at least selenium).
  3. Put the following Dockerfile in myfolder:
FROM python:latest

WORKDIR /configs
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
  4. Put the following docker-compose.yml in myfolder:
version: '3.1'

services:

  tgbot:
    build: .
    container_name: bot
    volumes:
      - .:/apps
    command: python /apps/app.py
    depends_on:
      - phantomjs
    networks:
      - botnet

  phantomjs:
    container_name: phantomjs
    image: shufo/phantomjs
    command: --webdriver 8901
    networks:
      - botnet

networks:
  botnet:
    driver: bridge
  5. Run cd myfolder, then docker-compose up.

Output:

phantomjs    | [INFO  - 2020-11-10T15:18:11.049Z] GhostDriver - Main - running on port 8901
phantomjs    | [INFO  - 2020-11-10T15:18:11.425Z] Session [f2091fe0-2367-11eb-bcd7-956b9cd40e54] - page.settings - {"XSSAuditingEnabled":false,"javascriptCanCloseWindows":true,"javascriptCanOpenWindows":true,"javascriptEnabled":true,"loadImages":true,"localToRemoteUrlAccessEnabled":false,"userAgent":"Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1","webSecurityEnabled":true}
phantomjs    | [INFO  - 2020-11-10T15:18:11.425Z] Session [f2091fe0-2367-11eb-bcd7-956b9cd40e54] - page.customHeaders:  - {}
phantomjs    | [INFO  - 2020-11-10T15:18:11.425Z] Session [f2091fe0-2367-11eb-bcd7-956b9cd40e54] - Session.negotiatedCapabilities - {"browserName":"phantomjs","version":"2.1.1","driverName":"ghostdriver","driverVersion":"1.2.0","platform":"linux-unknown-64bit","javascriptEnabled":true,"takesScreenshot":true,"handlesAlerts":false,"databaseEnabled":false,"locationContextEnabled":false,"applicationCacheEnabled":false,"browserConnectionEnabled":false,"cssSelectorsEnabled":true,"webStorageEnabled":false,"rotatable":false,"acceptSslCerts":false,"nativeEvents":true,"proxy":{"proxyType":"direct"}}
phantomjs    | [INFO  - 2020-11-10T15:18:11.425Z] SessionManagerReqHand - _postNewSessionCommand - New Session Created: f2091fe0-2367-11eb-bcd7-956b9cd40e54
bot exited with code 0
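
One caveat with the setup above: depends_on only controls the order in which the containers are started. It does not wait until GhostDriver is actually listening on port 8901, so the bot may occasionally try to connect too early. A minimal sketch of a workaround, assuming a hypothetical helper wait_for_port (not part of selenium) that you call in app.py before creating the driver:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or name not resolvable yet: retry shortly.
            time.sleep(interval)
    return False


# Intended use in app.py (host and port taken from the compose file above):
#
#     if not wait_for_port("phantomjs", 8901):
#         raise SystemExit("phantomjs did not become reachable in time")
#     driver = webdriver.Remote(...)
```

This only checks that the TCP port accepts connections, not that the WebDriver endpoint is fully ready, but in practice it removes most startup races.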
