How to deploy a PhantomJS Node app on AWS Lambda?
I threw together a small Lambda function to crawl a website using the SpookyJS, CasperJS, and PhantomJS toolchain for headless browsing. The task is quite simple, and a few months ago it was working in Lambda. I recently had to change a few things and wanted to work on the project again, but after starting fresh I could not get it to run in Lambda without errors. My question is: how can I run PhantomJS in Lambda?
The example code I am running is:
spooky.start('http://en.wikipedia.org/wiki/Spooky_the_Tuff_Little_Ghost');
spooky.then(function () {
this.emit('hello', 'Hello, from ' + this.evaluate(function () {
return document.title;
}));
});
spooky.run();
The error I am getting in Lambda is:
{ [Error: Child terminated with non-zero exit code 1] details: { code: 1, signal: null } }
I have followed a variety of procedures to ensure everything is able to run on Lambda. Below is a list of the things I have tried while diagnosing the problem:
- Run node index.js locally and confirm it is working
- Run npm install on the EC2 instance, and again run node index.js to ensure the correct output
My package.json is:
{
"name": "lambda-spooky-test",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"author": "",
"license": "ISC",
"dependencies": {
"casperjs": "^1.1.3",
"phantomjs-prebuilt": "^2.1.10",
"spooky": "^0.2.5"
}
}
I have also attempted the following (most of which also work locally and on the AWS EC2 instance, but give the same error on Lambda):
Ensuring casperjs and phantomjs are accessible from the path using:
process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'] + ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/.bin';
console.log( 'PATH: ' + process.env.PATH );
Inspecting spawn calls by wrapping child_process's .spawn() call, which gave the following:
{ '0': 'casperjs', '1': [ '/var/task/node_modules/spooky/lib/bootstrap.js', '--transport=http', '--command=casperjs', '--port=8081', '--spooky_lib=/var/task/node_modules/spooky/lib/../', '--spawnOptions=[object Object]' ], '2': {} }
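For readers who want to reproduce this inspection, here is a minimal sketch of one way to wrap .spawn(). It is not necessarily how the wrapper above was written; the names originalSpawn, spawnCalls, and loggedSpawn are made up for illustration. It assumes the wrapper is installed before require('spooky') runs, since a library may capture its own reference to spawn when it is loaded.

```javascript
const childProcess = require('child_process');

// Keep a reference to the real implementation so the wrapper can delegate to it.
const originalSpawn = childProcess.spawn;

// Hypothetical in-memory record of every spawn call, handy for later inspection.
const spawnCalls = [];

// Replace spawn with a wrapper that records and logs its arguments,
// then hands them unchanged to the real spawn.
childProcess.spawn = function loggedSpawn(command, args, options) {
  spawnCalls.push({ command, args, options });
  console.log('spawn:', JSON.stringify({ command, args, options }));
  return originalSpawn.apply(this, arguments);
};
```

Any later call to childProcess.spawn(...) made through the module object is then logged before the child process starts.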
Calling .exec('casperjs') and .exec('phantomjs --version') directly, which works locally and on EC2 but produces the following error in Lambda. The command:
require('child_process').exec('casperjs', (error, stdout, stderr) => {
  if (error) {
    console.error('error: ' + error);
  }
  console.log('out: ' + stdout);
  console.log('err: ' + stderr);
});
both with the following result: 两者都具有以下结果:
err: Error: Command failed: /bin/sh -c casperjs
module.js:327
throw err;
^
Error: Cannot find module '/var/task/node_modules/lib/phantomjs'
at Function.Module._resolveFilename (module.js:325:15)
at Function.Module._load (module.js:276:25)
at Module.require (module.js:353:17)
at require (internal/module.js:12:17)
at Object.<anonymous> (/var/task/node_modules/.bin/phantomjs:16:15)
at Module._compile (module.js:409:26)
at Object.Module._extensions..js (module.js:416:10)
at Module.load (module.js:343:32)
at Function.Module._load (module.js:300:12)
at Function.Module.runMain (module.js:441:10)
2016-08-07T15:36:37.349Z b9a1b509-5cb4-11e6-ae82-256a0a2817b9 sout:
2016-08-07T15:36:37.349Z b9a1b509-5cb4-11e6-ae82-256a0a2817b9 serr: module.js:327
throw err;
^
Error: Cannot find module '/var/task/node_modules/lib/phantomjs'
at Function.Module._resolveFilename (module.js:325:15)
at Function.Module._load (module.js:276:25)
at Module.require (module.js:353:17)
at require (internal/module.js:12:17)
at Object.<anonymous> (/var/task/node_modules/.bin/phantomjs:16:15)
at Module._compile (module.js:409:26)
at Object.Module._extensions..js (module.js:416:10)
at Module.load (module.js:343:32)
at Function.Module._load (module.js:300:12)
at Function.Module.runMain (module.js:441:10)
I found the issue to be that including node_modules/.bin in the path works on both local and EC2 machines because those files simply point to the actual /bin folders in each respective library. This breaks if calls within those files use relative paths, as the trace above shows: the require at /var/task/node_modules/.bin/phantomjs:16 resolved to /var/task/node_modules/lib/phantomjs, which does not exist. The issue:
[ec2-user@ip-172-31-32-87 .bin]$ ls -lrt
total 0
lrwxrwxrwx 1 ec2-user ec2-user 35 Aug 7 00:52 phantomjs -> ../phantomjs-prebuilt/bin/phantomjs
lrwxrwxrwx 1 ec2-user ec2-user 24 Aug 7 00:52 casperjs -> ../casperjs/bin/casperjs
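To make the failure concrete, the sketch below shows how a relative lookup resolves differently depending on which directory the launcher script is treated as living in. It assumes the phantomjs launcher looks up ../lib/phantomjs relative to its own directory, which is what the stack trace suggests; resolveLauncherTarget is a hypothetical helper, not part of any library.

```javascript
const path = require('path');

// Hypothetical helper: where a launcher's "../lib/phantomjs" lookup lands,
// given the directory the launcher file is treated as living in.
function resolveLauncherTarget(launcherDir) {
  return path.posix.join(launcherDir, '..', 'lib', 'phantomjs');
}

// Launcher run from the package's own bin directory: the lookup stays inside the package.
const viaRealBin = resolveLauncherTarget('/var/task/node_modules/phantomjs-prebuilt/bin');

// Launcher run as a plain file inside node_modules/.bin: the lookup escapes the package.
const viaDotBin = resolveLauncherTarget('/var/task/node_modules/.bin');

console.log(viaRealBin); // /var/task/node_modules/phantomjs-prebuilt/lib/phantomjs
console.log(viaDotBin);  // /var/task/node_modules/lib/phantomjs
```

The second path matches the "Cannot find module" message in the Lambda logs.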
I worked around this by adding each library's respective bin directory to the path in the Lambda handler function:
process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT']
+ ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/phantomjs-prebuilt/bin'
+ ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/casperjs/bin';
With this change, PhantomJS, CasperJS, and SpookyJS now run correctly in Lambda.
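The same workaround can be written as a small helper, which keeps the directory list in one place and is easier to test outside Lambda. This is only a sketch: buildLambdaPath is a hypothetical name, and the '/var/task' fallback is an assumption taken from the paths in the stack trace so that the code can run where LAMBDA_TASK_ROOT is not set.

```javascript
const path = require('path');

// Hypothetical helper: append the task root and each package's own bin
// directory to an existing PATH string.
function buildLambdaPath(currentPath, taskRoot) {
  const extraDirs = [
    taskRoot,
    path.posix.join(taskRoot, 'node_modules', 'phantomjs-prebuilt', 'bin'),
    path.posix.join(taskRoot, 'node_modules', 'casperjs', 'bin'),
  ];
  // filter(Boolean) drops an empty or undefined starting PATH.
  return [currentPath].concat(extraDirs).filter(Boolean).join(':');
}

// LAMBDA_TASK_ROOT is set by the Lambda runtime; the fallback is only
// there so the sketch can be run outside Lambda.
const taskRoot = process.env.LAMBDA_TASK_ROOT || '/var/task';
process.env.PATH = buildLambdaPath(process.env.PATH, taskRoot);
console.log('PATH: ' + process.env.PATH);
```

As with the original snippet, this should run at the top of the handler module, before SpookyJS spawns casperjs.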