What do you call a library that emulates a browser for automation purposes?

I have an automation task in which I need to fill in several forms on a site with data from Word documents. For that I need a library that emulates a browser and lets me programmatically open a site and access its HTML elements. What is this called? Are there examples of libraries that do this for Python or Clojure?

You have a couple of choices:

  1. Mechanize
  2. Selenium

There are others too, but I can't remember them off the top of my head right now (I will post them as and when I remember more).

You may want to take a look at PhantomJS too:

PhantomJS is a headless WebKit with JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.

If you just want to submit a form, it would probably be easier to forge the request and send it with urllib2.
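As a rough sketch of what forging a form request could look like: in Python 3 the urllib2 functionality lives in urllib.request, so the example below uses that. The page URL, the sample HTML, and the field names (csrf, title) are made up for illustration, and the helper names FormFieldParser and build_form_request are hypothetical. The sketch reads the form's action and its named inputs (so hidden fields are carried along), overlays your own values, and builds a POST request without sending it.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormFieldParser(HTMLParser):
    """Record the first form's action and every named <input> on the page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
        elif tag == "input" and attrs.get("name"):
            # Inputs without a value attribute are submitted as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_form_request(page_url, html, values):
    """Build (but do not send) a POST request that mimics a form submission."""
    parser = FormFieldParser()
    parser.feed(html)
    # Start from the defaults found in the page, then overlay our own values.
    data = {**parser.fields, **values}
    body = urlencode(data).encode("utf-8")
    return Request(
        urljoin(page_url, parser.action or ""),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical page content and field names, used only for illustration.
SAMPLE_HTML = """
<form action="/submit" method="post">
  <input type="hidden" name="csrf" value="abc123">
  <input type="text" name="title">
</form>
"""

request = build_form_request(
    "http://example.com/form", SAMPLE_HTML, {"title": "Report 1"}
)
print(request.get_method(), request.full_url)
print(request.data)
```

To actually submit it you would pass the request to urllib.request.urlopen. This only works when the site does not rely on JavaScript to build or validate the form; in that case a real or headless browser is the safer choice.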

In Clojure these days, http-kit is my favorite. It makes HTTP interaction very easy.

; taken from github
(defn on-response [resp]
  ;; {:status 200 :body "....." :headers {:key val :key val}}
  (println resp))

;;; initialize, timeout is 40s, and default user-agent
(http/init :timeout 40000 :user-agent "http-kit/1.1")

;;; other params :headers :proxy binary? keyify?
(http/get {:url "http://shenfeng.me" :cb on-response})

;;; POST with a body; same optional params as above
(http/post {:url "http://example/"
            :cb on-response
            :body {"name" "http-kit" "author" "shenfeng"}
            :binary? true})

I have also used CasperJS, and it makes pretty much any headless browsing possible. You can also interact with client-side JavaScript while automating the browsing. The only drawback I found was that it was slightly harder to integrate with existing code, but as a standalone tool it was perfect. It supports scripting in both CoffeeScript and JavaScript.

Look at the Quickstart to get an idea of how it works.
