How to use Selenium on Google Colaboratory?
I am scraping a lot of information from the web and would like to run the job in the cloud. So I tried Colaboratory, but it raises an error:
WebDriverException Traceback (most recent call last)
<ipython-input-35-abcc3b93dfa7> in <module>()
20 options.add_argument("--start-maximized");
21 options.add_argument("--headless");
---> 22 driver = webdriver.Chrome('chromedriver', chrome_options=options)
23
24 book = cd + "/target.xlsx"
/usr/local/lib/python3.6/dist-packages/selenium/webdriver/chrome/webdriver.py in __init__(self, executable_path, port, options, service_args, desired_capabilities, service_log_path, chrome_options, keep_alive)
71 service_args=service_args,
72 log_path=service_log_path)
---> 73 self.service.start()
74
75 try:
/usr/local/lib/python3.6/dist-packages/selenium/webdriver/common/service.py in start(self)
96 count = 0
97 while True:
---> 98 self.assert_process_still_running()
99 if self.is_connectable():
100 break
/usr/local/lib/python3.6/dist-packages/selenium/webdriver/common/service.py in assert_process_still_running(self)
109 raise WebDriverException(
110 'Service %s unexpectedly exited. Status code was: %s'
--> 111 % (self.path, return_code)
112 )
113
WebDriverException: Message: Service chromedriver unexpectedly exited. Status code was: -6
I read the answers to "How can we use Selenium Webdriver in colab.research.google.com?", which say this works, but it does not work for me.
Any ideas are appreciated.
My options are:
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('chromedriver', chrome_options=options)
↑ This last line raises the error:
WebDriverException: Message: Service chromedriver unexpectedly exited. Status code was: -6
============================================ My entire code ============================================
!sudo apt install unzip
!wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
!unzip chromedriver_linux64.zip -d /usr/bin/
from google.colab import drive
drive.mount('/content/drive')
!pip install selenium
!pip install openpyxl
Then the Python script is:
cd = "drive/My Drive/doc/業務資料/イーコレ/scrap/*"
import os, subprocess
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
import selenium
import bs4
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
import openpyxl
import time, re, csv, urllib.parse
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('chromedriver', chrome_options=options)
# install chromium, its driver, and selenium
!apt update
!apt install chromium-chromedriver
!pip install selenium
# set options to be headless, ..
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# open it, go to a website, and get results
wd = webdriver.Chrome('chromedriver',options=options)
wd.get("https://www.website.com")
print(wd.page_source) # results
I have wrapped all of this into a library:
!pip install kora
from kora.selenium import wd
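Before starting the driver, it can help to confirm that a chromedriver binary is actually reachable on PATH after the `apt install` step. The following is a minimal sketch, not part of the original answer; `find_chromedriver` is a hypothetical helper name, and it relies only on the standard-library `shutil.which`.

```python
import shutil


def find_chromedriver(candidates=("chromedriver",)):
    """Return the absolute path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)  # None when the executable is not on PATH
        if path:
            return path
    return None


# Report where chromedriver lives, or that it is missing.
path = find_chromedriver()
print(path or "chromedriver not found on PATH")
```

If this prints that nothing was found, `webdriver.Chrome('chromedriver', ...)` has no executable to launch, so the install step is the first thing to recheck.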
I think this code would work:
!sudo apt install unzip
!wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
#!unzip chromedriver_linux64.zip -d /usr/bin/
from google.colab import drive
!pip install selenium
!pip install openpyxl
!apt-get update
!apt-get install -y unzip xvfb libxi6 libgconf-2-4
!apt-get install default-jdk
cd = "drive/My Drive/doc/業務資料/イーコレ/scrap/*"
import os, subprocess
import sys
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
import selenium
import bs4
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
import openpyxl
import time, re, csv, urllib.parse
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('chromedriver', chrome_options=options)
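A note on the "Status code was: -6" in the question. Assuming Selenium reports the return code of the chromedriver subprocess unchanged (an assumption, based on how Python's `subprocess` module reports return codes), a negative value means the process was killed by that signal number. On Linux, signal 6 is SIGABRT, which would mean the chromedriver binary aborted on startup rather than being rejected by Selenium. The sketch below decodes such a code; `describe_exit` is a hypothetical helper name.

```python
import signal


def describe_exit(return_code):
    """Describe a subprocess return code using the POSIX convention."""
    if return_code < 0:
        # Negative values mean "terminated by signal -return_code" on POSIX.
        try:
            return "killed by " + signal.Signals(-return_code).name
        except ValueError:
            # The signal number is not defined on this platform.
            return f"killed by signal {-return_code}"
    return f"exited with status {return_code}"


print(describe_exit(-6))
```

If the binary itself is aborting, running `!chromedriver --version` in a cell may surface the underlying message, for example a missing shared library or a browser that is not installed.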