
How to run PhantomJS as a server and call it remotely?

This is probably a very basic question. I would like to run the headless browser PhantomJS as a server rather than as a command-line tool.

Once it is running, I would like to call it remotely over HTTP. All I need is to send a URL and get back the HTML output. I need it to generate HTML for an AJAX application so that the application becomes searchable.

Is it possible?

You can run PhantomJS perfectly well as a web server, because it has the Web Server Module. The examples folder contains, for instance, a server.js example. This runs standalone without any dependencies (no Node.js required).

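var page = require('webpage').create(),
    server = require('webserver').create(),
    port = 8080; // example port; pick any free one

var service = server.listen(port, function (request, response) {
    console.log('Request received at ' + new Date());
    // Assumption: the target address arrives as a `url` query parameter,
    // e.g. /?url=http%3A%2F%2Fexample.com%2F
    var match = /[?&]url=([^&]*)/.exec(request.url),
        someUrl = '';
    try { someUrl = match ? decodeURIComponent(match[1]) : ''; } catch (e) {}
    page.open(someUrl, function (status) {
        response.headers = {
            'Cache-Control': 'no-cache',
            'Content-Type': 'text/plain;charset=utf-8'
        };
        if (status !== 'success') {
            console.log('Unable to open ' + someUrl);
            response.statusCode = 502;
            response.write('Unable to open the requested URL');
        } else {
            response.statusCode = 200;
            // page.content holds the HTML of the rendered page
            response.write(page.content);
        }
        response.close(); // always close, otherwise the client hangs
    });
});

if (!service) {
    console.log('Could not start the web server on port ' + port);
    phantom.exit(1);
}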
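Because the server opens whatever address it is handed, it may be worth validating the requested URL before passing it to page.open. The following is a minimal sketch, not part of PhantomJS: the helper name parseTargetUrl and the `url` query parameter are assumptions made for this example. It is written in plain ES5 so it should work inside a PhantomJS script as well as in Node.

```javascript
// Hypothetical helper: extract and validate the target URL from a request
// path such as "/?url=http%3A%2F%2Fexample.com%2F".
// Returns the decoded URL, or null when it is missing, malformed or unsafe.
function parseTargetUrl(requestUrl) {
  var match = /[?&]url=([^&#]*)/.exec(requestUrl || '');
  if (!match) {
    return null;
  }
  var target;
  try {
    target = decodeURIComponent(match[1]);
  } catch (e) {
    return null; // malformed percent-encoding
  }
  // Only allow http(s), so callers cannot make the server open file:// URLs.
  return /^https?:\/\/[^\s\/]+/i.test(target) ? target : null;
}
```

In the request handler, a null result could be answered with a 400 status instead of calling page.open.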

If you want to run PhantomJS through Node.js, this is also easily doable with phantomjs-node, a PhantomJS bridge for Node.

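var http = require('http');
var url = require('url');
var phantom = require('phantom');

// Note: this is the callback-style API of early phantomjs-node releases;
// later releases of the `phantom` package are promise-based.
phantom.create(function (ph) {
  ph.createPage(function (page) {
    http.createServer(function (req, res) {
      // Assumption: the target address arrives as a `url` query parameter
      var someURL = url.parse(req.url, true).query.url || '';
      page.open(someURL, function (status) {
        if (status !== 'success') {
          res.writeHead(502, {'Content-Type': 'text/plain'});
          return res.end('Unable to open the requested URL');
        }
        // Serialize the rendered DOM inside the page and send it back
        page.evaluate(function () {
          return document.documentElement.outerHTML;
        }, function (result) {
          res.writeHead(200, {'Content-Type': 'text/plain;charset=utf-8'});
          res.end(result);
        });
      });
    }).listen(8080);
  });
});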
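Once either server is running, calling it remotely is an ordinary HTTP GET. As a rough sketch of the client side in Node: the function names buildRenderUrl and fetchRendered are made up for this example, and the `?url=` query parameter is an assumed convention that has to match however your server parses the request.

```javascript
var http = require('http');

// Build the request URL for the rendering server. The "?url=" parameter is
// an assumed convention; adjust it to whatever the server side expects.
function buildRenderUrl(serverBase, targetUrl) {
  return serverBase.replace(/\/+$/, '') + '/?url=' + encodeURIComponent(targetUrl);
}

// Fetch the rendered HTML and hand it to callback(err, html).
function fetchRendered(serverBase, targetUrl, callback) {
  http.get(buildRenderUrl(serverBase, targetUrl), function (res) {
    var chunks = [];
    res.on('data', function (chunk) { chunks.push(chunk); });
    res.on('end', function () {
      if (res.statusCode !== 200) {
        return callback(new Error('Render server returned ' + res.statusCode));
      }
      callback(null, Buffer.concat(chunks).toString('utf8'));
    });
  }).on('error', callback);
}
```

A call would then look like fetchRendered('http://localhost:8080', 'http://example.com/', function (err, html) { ... }), with the host and port replaced by wherever the PhantomJS server actually listens.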

Notes

You can freely use this as is, as long as you don't have multiple requests at the same time. If you do, you either need to synchronize the requests (because there is only one page object), or you need to create a new page object for every request and close() it when you're done.
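One way to synchronize the requests around a single page object is a small first-in, first-out queue that starts the next job only after the previous one has signalled completion. This is only a sketch under that assumption; createSerialQueue and its push method are hypothetical names, not part of PhantomJS or phantomjs-node.

```javascript
// Hypothetical helper: run queued tasks strictly one at a time.
// Each task receives a `done` callback and must call it when finished.
function createSerialQueue() {
  var pending = [];
  var busy = false;

  function next() {
    if (busy || pending.length === 0) {
      return;
    }
    busy = true;
    var task = pending.shift();
    var called = false;
    task(function done() {
      if (called) {
        return; // ignore accidental double calls
      }
      called = true;
      busy = false;
      next();
    });
  }

  return {
    push: function (task) {
      pending.push(task);
      next();
    }
  };
}
```

Inside the request handler, the page.open call would be wrapped in queue.push(function (done) { ... }), with done() invoked right after the response has been closed, so the shared page is never opened by two requests at once.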

The easiest way is to write a Python script, or something similarly simple, that starts the server and uses Python WebSockets to communicate with it, with a web form of some sort to query for a website and get the page source. Any automation can be done via cron jobs or, if you are on Windows, with the Task Scheduler to autostart the Python script.
