Chrome Browser Automation without Using ChromeDriver and Selenium in Ubuntu

I am trying to automate the Chrome browser itself (not ChromeDriver) on Ubuntu in order to save thousands of pages. I cannot use ChromeDriver or Selenium, because the site somehow blocks both.

On Mac OS, AppScript can control Chrome without ChromeDriver and Selenium, and with it I succeeded in automating the page downloads. However, I have not found an alternative to AppScript on Ubuntu.

So I am using a keyboard automation tool (xdotool), following automate-save-page-as. It lets me open a single page and save it to disk, but it is too slow and unstable, and the code is hard to understand.
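For readers unfamiliar with this approach, the single-page flow described above can be sketched roughly as below. This is only an illustration, not the actual automate-save-page-as script: the function name `save_page_commands`, the window class pattern `google-chrome`, the wait times, and the idea that the Save dialog accepts a typed path followed by Return are all assumptions that may need adjusting on a given desktop. The `__main__` block only prints the commands (a dry run).

```python
import shlex


def save_page_commands(url, destination, load_wait=5.0):
    """Build the shell commands (as argument lists) for saving one page.

    Hypothetical helper: the class pattern and timings are assumptions.
    """
    return [
        # Ask the (already running) browser to open the URL.
        ["google-chrome", url],
        # Crude wait for the page to finish loading.
        ["sleep", str(load_wait)],
        # Find a visible Chrome window and give it keyboard focus.
        ["xdotool", "search", "--onlyvisible", "--class", "google-chrome",
         "windowactivate"],
        # Open the "Save as" dialog.
        ["xdotool", "key", "ctrl+s"],
        ["sleep", "1"],
        # Type the destination path into the dialog and confirm.
        ["xdotool", "type", "--delay", "20", destination],
        ["xdotool", "key", "Return"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in save_page_commands("https://www.example.com", "/tmp/example0.html"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` once the printed commands look right. The fixed sleeps are what make this style of automation slow and fragile, which matches the complaint above.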

Is there any workable way to automate the Chrome browser on Ubuntu without using Selenium and ChromeDriver? Or could anyone give me some hints on how to open multiple pages at the same time with xdotool and save them locally after a few seconds?

I implemented a solution to this problem. See "ubuntu_automation_example_multiple.py" in the repository below.

https://github.com/jonghkim/browser-automation-beyond-firewall

I wrote the two essential script files, "save_page_as_multiple_open" and "save_page_as_multiple_save", based on automate-save-page-as.

# -*- coding: utf-8 -*-
import os
import subprocess
import time


def trick_open(url, fname):
    # Open the URL in Chrome via the xdotool-based helper script.
    # Passing a list avoids shell-quoting problems with unusual URLs.
    subprocess.run(["./save_page_as_multiple_open", url, "--destination", fname])


def trick_save(url, fname):
    # Save a page opened by trick_open to fname and close it.
    subprocess.run(["./save_page_as_multiple_save", url, "--destination", fname])


if __name__ == "__main__":
    url = 'https://www.example.com'
    cwd = os.getcwd()

    # Step 1: open all pages first.
    for i in range(5):
        trick_open(url, os.path.join(cwd, "example{}.html".format(i)))

    # Step 2: give the pages a few seconds to load.
    time.sleep(5)

    # Step 3: save and close the pages in reverse order.
    for i in reversed(range(5)):
        path = os.path.join(cwd, "example{}.html".format(i))
        print("Save Path: ", path)
        trick_save(url, path)

    # Close Chrome once everything has been saved.
    subprocess.run(["killall", "google-chrome"])

"save_page_as_multiple_open" opens multiple URLs using xdotool. After that, "save_page_as_multiple_save" saves each page and closes it, working through the pages in reverse order.
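To scale this open-then-save pattern from five pages to thousands, one option is to split the URL list into small batches and compute the open order and the reversed save order for each batch. The sketch below only does that planning. The helper names `chunked` and `plan_batch` are hypothetical and not part of the repository, and the guess that reverse order is needed because the most recently opened tab is the one in focus is my assumption, not something the scripts document.

```python
import os


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batch(urls, out_dir, first_index=0):
    """Return (open_jobs, save_jobs) for one batch of URLs.

    Each job is a (url, destination_path) pair. save_jobs is open_jobs
    reversed, mirroring the reverse-order save step described above.
    """
    open_jobs = [
        (url, os.path.join(out_dir, "example{}.html".format(first_index + i)))
        for i, url in enumerate(urls)
    ]
    return open_jobs, list(reversed(open_jobs))


if __name__ == "__main__":
    all_urls = ["https://www.example.com/page{}".format(n) for n in range(12)]
    done = 0
    for batch in chunked(all_urls, 5):
        open_jobs, save_jobs = plan_batch(batch, os.getcwd(), first_index=done)
        print("open:", [dest for _, dest in open_jobs])
        print("save:", [dest for _, dest in save_jobs])
        done += len(batch)
```

In a real run, each entry in `open_jobs` would be passed to `trick_open` and, after a short sleep, each entry in `save_jobs` to `trick_save`. Keeping batches small limits how many tabs are open at once.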

