How to run PhantomJS as a server and call it remotely?
This is probably a very basic question. I would like to run the headless browser PhantomJS as a server rather than as a command-line tool. Once it is running, I would like to call it remotely over HTTP. All I need is to send a URL and get back the HTML output. I need it to generate HTML for an AJAX application so that the application becomes searchable.
Is this possible?
You can run PhantomJS as a web server without any trouble, because it ships with the Web Server Module. The examples folder contains a server.js example. It runs standalone, without any dependencies (no Node.js required).
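var page = require('webpage').create(),
    server = require('webserver').create();

var port = 8080; // example port (an assumption; pick any free port)

// `someUrl` and `result` are placeholders to be filled in by the TODOs below
var service = server.listen(port, function (request, response) {
    console.log('Request received at ' + new Date());
    // TODO: parse `request` and determine where to go (set `someUrl`)
    page.open(someUrl, function (status) {
        if (status !== 'success') {
            console.log('Unable to open ' + someUrl);
            // Always answer, otherwise the client hangs waiting for a response
            response.statusCode = 500;
            response.write('Unable to open the requested page');
            response.close();
        } else {
            response.statusCode = 200;
            response.headers = {
                'Cache': 'no-cache',
                'Content-Type': 'text/plain;charset=utf-8'
            };
            // TODO: do something on the page and generate `result`
            // (for example, `page.content` holds the rendered HTML)
            response.write(result);
            response.close();
        }
    });
});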
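The first TODO in the script above (deciding where to go) could be filled in by reading the target address from the query string. The sketch below is one possible approach, not part of the PhantomJS examples: the helper name `extractTargetUrl` and the `url` query parameter are assumptions made for illustration. It is written in plain ES5 so that it should also run in PhantomJS's older JavaScript engine, and it assumes `request.url` holds the path and query string of the incoming request (for example `/?url=...`).

```javascript
// Hypothetical helper: pull the target page URL out of a request path
// such as "/?url=http%3A%2F%2Fexample.com%2F".
// Returns null when the parameter is missing or not an http(s) URL.
function extractTargetUrl(requestUrl) {
    var queryStart = requestUrl.indexOf('?');
    if (queryStart === -1) {
        return null;
    }
    var pairs = requestUrl.slice(queryStart + 1).split('&');
    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf('=');
        if (eq === -1 || pairs[i].slice(0, eq) !== 'url') {
            continue;
        }
        var value;
        try {
            value = decodeURIComponent(pairs[i].slice(eq + 1));
        } catch (e) {
            return null; // malformed percent-encoding
        }
        // Accept only http(s) targets so the server cannot be pointed at file:// URLs
        return /^https?:\/\//i.test(value) ? value : null;
    }
    return null;
}
```

Inside the request handler, `var someUrl = extractTargetUrl(request.url);` would then supply the address, and a `null` result could be answered with a 400 status instead of opening a page.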
If you want to run PhantomJS through Node.js, that is also easy to do with phantomjs-node, a PhantomJS bridge for Node.
var http = require('http');
var phantom = require('phantom');

// `someURL` and `result` are placeholders to be filled in by the TODOs below
phantom.create(function (ph) {
    ph.createPage(function (page) {
        http.createServer(function (req, res) {
            // TODO: parse `req` and determine where to go (set `someURL`)
            page.open(someURL, function (status) {
                res.writeHead(200, {'Content-Type': 'text/plain'});
                // TODO: do something on the page and generate `result`
                res.end(result);
            });
        }).listen(8080);
    });
});
You can use this as is, as long as you never have multiple requests at the same time. If you do, you either need to synchronize the requests (because there is only one page object), or you need to create a new page object for every request and close() it again when you're done.
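One way to synchronize requests around the single page object is a small FIFO queue that starts the next job only after the previous one has finished. The sketch below is a minimal illustration of that idea; `createSerialQueue` and its `push` method are hypothetical names, not part of PhantomJS or phantomjs-node, and error handling is left out.

```javascript
// Hypothetical helper: runs queued jobs strictly one at a time.
// Each job receives a `done` callback and must call it exactly once when finished.
function createSerialQueue() {
    var jobs = [];
    var busy = false;

    function next() {
        if (busy || jobs.length === 0) {
            return;
        }
        busy = true;
        var job = jobs.shift();
        job(function done() {
            busy = false;
            next(); // start the following job, if any
        });
    }

    return {
        push: function (job) {
            jobs.push(job);
            next();
        }
    };
}
```

In the request handler, each incoming request would push a job that calls `page.open(...)` and invokes `done()` only after the response has been written, so a second request waits until the page object is free again.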
The easiest way is to write a Python script, or something similarly simple, that starts the server, and to use Python WebSockets to communicate with it, with a web form of some kind to submit a URL and get the page source back. Any automation can be done via cron jobs or, if you are on Windows, with the Scheduled Tasks feature to autostart the Python script.