
How to deploy a phantomjs node app on AWS Lambda?

I threw together a small Lambda function to crawl a website using the SpookyJS, CasperJS, and PhantomJS toolchain for headless browsing. The task is quite simple, and at some point a few months ago it was working in Lambda. I recently had to change a few things and wanted to work on the project again, but after starting fresh I could not get it to run in Lambda without errors. My question is: how can I run phantomjs in Lambda?

The example code I am running is:

spooky.start('http://en.wikipedia.org/wiki/Spooky_the_Tuff_Little_Ghost');
spooky.then(function () {
    this.emit('hello', 'Hello, from ' + this.evaluate(function () {
        return document.title;
    }));
});
spooky.run();

The error I am getting in Lambda is:

{ [Error: Child terminated with non-zero exit code 1] details: { code: 1, signal: null } }

I have followed a variety of procedures to ensure everything is able to run on Lambda. Below is the list of steps I took while diagnosing the problem:

  1. Run locally using node index.js and confirm it is working
  2. Upload package.json and the js file to an Amazon Linux EC2 instance for compilation, as recommended for npm installation calls and described here
  3. Run npm install on the EC2 instance, and again run node index.js to ensure the correct output
  4. Zip everything up, and deploy to AWS using the CLI

My package.json is:

{
  "name": "lambda-spooky-test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "casperjs": "^1.1.3",
    "phantomjs-prebuilt": "^2.1.10",
    "spooky": "^0.2.5"
  }
}

I have also attempted the following (most of which also work locally and on the AWS EC2 instance, but fail with the same error on Lambda):

  1. Trying the non-prebuilt version of phantom
  2. Ensuring casperjs and phantomjs are accessible from the path with:

     process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'] + ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/.bin';
     console.log( 'PATH: ' + process.env.PATH );

  3. Inspecting spawn calls by wrapping child_process's .spawn() call, which logged the following:

     { '0': 'casperjs', '1': [ '/var/task/node_modules/spooky/lib/bootstrap.js', '--transport=http', '--command=casperjs', '--port=8081', '--spooky_lib=/var/task/node_modules/spooky/lib/../', '--spawnOptions=[object Object]' ], '2': {} } 
  4. Calling .exec('casperjs') and .exec('phantomjs --version') directly, which works locally and on EC2 but fails in Lambda. The command:

     require('child_process').exec('casperjs', (error, stdout, stderr) => {
         if (error) { console.error('error: ' + error); }
         console.log('out: ' + stdout);
         console.log('err: ' + stderr);
     });

Both commands produce the following result:

err: Error: Command failed: /bin/sh -c casperjs
module.js:327
    throw err;
    ^

Error: Cannot find module '/var/task/node_modules/lib/phantomjs'
    at Function.Module._resolveFilename (module.js:325:15)
    at Function.Module._load (module.js:276:25)
    at Module.require (module.js:353:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/var/task/node_modules/.bin/phantomjs:16:15)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)
    at Function.Module._load (module.js:300:12)
    at Function.Module.runMain (module.js:441:10)

2016-08-07T15:36:37.349Z    b9a1b509-5cb4-11e6-ae82-256a0a2817b9    sout: 
2016-08-07T15:36:37.349Z    b9a1b509-5cb4-11e6-ae82-256a0a2817b9    serr: module.js:327
    throw err;
    ^

Error: Cannot find module '/var/task/node_modules/lib/phantomjs'
    at Function.Module._resolveFilename (module.js:325:15)
    at Function.Module._load (module.js:276:25)
    at Module.require (module.js:353:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/var/task/node_modules/.bin/phantomjs:16:15)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)
    at Function.Module._load (module.js:300:12)
    at Function.Module.runMain (module.js:441:10)
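
The spawn inspection mentioned in step 3 can be sketched roughly as follows. This is only an illustration, not the exact wrapper used above: the helper name logSpawnCalls is made up, and it assumes the wrapper is installed before any library that captures child_process.spawn is loaded.

```javascript
const childProcess = require('child_process');

// Hypothetical helper: replaces child_process.spawn with a wrapper that
// logs the arguments of every call and then delegates to the original.
// Returns a function that puts the original spawn back.
function logSpawnCalls(log = console.log) {
  const originalSpawn = childProcess.spawn;

  childProcess.spawn = function (...args) {
    log(args); // [command, argsArray, options]
    return originalSpawn.apply(this, args);
  };

  return function restore() {
    childProcess.spawn = originalSpawn;
  };
}

// Usage (assumption: called at the top of the handler module):
//   const restore = logSpawnCalls();
//   ... require and run the code that spawns casperjs ...
//   restore();
```

Logging the arguments this way shows which command name is being resolved through PATH, which is what pointed at casperjs in the output above.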

I found the issue to be that including node_modules/.bin in the path works on both the local and EC2 machines because the files there are simply symlinks to the actual bin folders in each respective library. This breaks if the scripts behind those links use relative paths and the links are not preserved. The stack trace above is consistent with that: the script runs from /var/task/node_modules/.bin/phantomjs and looks for lib/phantomjs relative to that location, ending up at /var/task/node_modules/lib/phantomjs, which does not exist. The symlinks on EC2:

[ec2-user@ip-172-31-32-87 .bin]$ ls -lrt
total 0
lrwxrwxrwx 1 ec2-user ec2-user 35 Aug  7 00:52 phantomjs -> ../phantomjs-prebuilt/bin/phantomjs
lrwxrwxrwx 1 ec2-user ec2-user 24 Aug  7 00:52 casperjs -> ../casperjs/bin/casperjs
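
To check from inside the function whether the deployed package still contains these symlinks, something like the following could be logged. This is a diagnostic sketch only; describeBinEntries is a hypothetical helper name, and it assumes the .bin directory exists at the path passed in.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical diagnostic: lists each entry of a .bin directory and reports
// whether it is a symlink and, if so, where it points.
function describeBinEntries(binDir) {
  return fs.readdirSync(binDir).sort().map((name) => {
    const fullPath = path.join(binDir, name);
    // lstat inspects the link itself instead of following it.
    const isSymlink = fs.lstatSync(fullPath).isSymbolicLink();
    return {
      name,
      isSymlink,
      target: isSymlink ? fs.readlinkSync(fullPath) : null,
    };
  });
}

// Usage inside a handler (LAMBDA_TASK_ROOT is set by the Lambda runtime):
//   const binDir = path.join(process.env.LAMBDA_TASK_ROOT, 'node_modules', '.bin');
//   console.log(describeBinEntries(binDir));
```

If the entries are reported as regular files rather than symlinks, relative lookups inside those scripts start from node_modules/.bin instead of the library's own bin folder.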

I worked around this by adding each library's own bin folder to the PATH in the Lambda handler function:

process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'] 
        + ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/phantomjs-prebuilt/bin'
        + ':' + process.env['LAMBDA_TASK_ROOT'] + '/node_modules/casperjs/bin';

With this change, phantom, casper, and spooky now run correctly in Lambda.
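
The same PATH fix can be written as a small helper, which may be easier to reuse and to test outside Lambda. This is a sketch under assumptions: addBinDirsToPath is a made-up name, and it takes the environment object as a parameter so it can be exercised with a fake one.

```javascript
const path = require('path');

// Hypothetical helper: appends the task root and the given bin folders
// (relative to the task root) to env.PATH and returns the new value.
function addBinDirsToPath(env, relativeBinDirs) {
  const taskRoot = env.LAMBDA_TASK_ROOT;
  if (!taskRoot) {
    throw new Error('LAMBDA_TASK_ROOT is not set');
  }

  // Lambda runs on Linux, so POSIX joins and ':' separators are used explicitly.
  const extraDirs = [
    taskRoot,
    ...relativeBinDirs.map((dir) => path.posix.join(taskRoot, dir)),
  ];

  env.PATH = [env.PATH, ...extraDirs].filter(Boolean).join(':');
  return env.PATH;
}

// Usage at the top of the handler module:
//   addBinDirsToPath(process.env, [
//     'node_modules/phantomjs-prebuilt/bin',
//     'node_modules/casperjs/bin',
//   ]);
```

Pointing PATH at each library's real bin folder avoids node_modules/.bin entirely, so the scripts resolve their relative paths from their original location.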
