
Run two functions in sequence within the same AWS Lambda code

I have the following Lambda function, which works fine for starting a Databricks cluster when invoked. Now I would like to add a second Lambda function and run it in sequence, say after an interval of 60 seconds. I tried listing both Lambda functions one after the other, but only the last one was executed, and the job failed because the cluster was still in the TERMINATED state. Can someone please help me run the job after the cluster has started?

Lambda for starting the Databricks cluster:

const https = require("https");
var tokenstr = "token:xxxxxxxxaaaaaabbbbbccccccc";

exports.handler = (event, context, callback) => {
  var data = JSON.stringify({
    "cluster_id": "2222-111000-123abcde"
  });

  var start_cluster_options = {
    host: "aaa.cloud.databricks.com",
    port: 443,
    path: "/api/2.0/clusters/start",
    method: "POST",
    // authentication headers
    headers: {
      // Buffer.from replaces the deprecated new Buffer(...)
      "Authorization": "Basic " + Buffer.from(tokenstr).toString("base64"),
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(data)
    }
  };

  var request = https.request(start_cluster_options, function (res) {
    var body = "";

    res.on("data", function (chunk) {
      body += chunk;
    });

    res.on("end", function () {
      console.log(body);
    });
  });

  // connection errors are emitted on the request, not on the response
  request.on("error", function (e) {
    console.log("Got error: " + e.message);
  });

  request.write(data);
  request.end();
};

Function to run the Databricks job from Lambda:

exports.handler = (event, context, callback) => {
  var data = JSON.stringify({
    "job_id": 11111
  });

  var run_job_options = {
    host: "aaa.cloud.databricks.com",
    port: 443,
    path: "/api/2.0/jobs/run-now",
    method: "POST",
    // authentication headers
    headers: {
      "Authorization": "Basic " + Buffer.from(tokenstr).toString("base64"),
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(data)
    }
  };

  var request = https.request(run_job_options, function (res) {
    var body = "";

    res.on("data", function (chunk) {
      body += chunk;
    });

    // The original paste stops here; the rest mirrors the start-cluster function.
    res.on("end", function () {
      console.log(body);
    });
  });

  request.write(data);
  request.end();
};

I would like to have both START and RUN_JOB in the same Lambda function. If that is not the best approach, please point me in the right direction; I am new to Lambda invocations.

UPDATE:

I have modified my code as suggested by @Dudemullet, and I am now getting the error message "2018-08-15T22:28:14.446Z 7dfe42ff-a0da-11e8-9e71-f77e93d8a2f8 Task timed out after 3.00 seconds". I'm not sure what I am doing wrong; please help.

const https = require("https");
var tokenstr = "token:xxxxxxxxaaaaaabbbbbccccccc";

 var data = JSON.stringify({
    "cluster_id": "2222-111000-123abcde"
  });

 var data2 = JSON.stringify({
   "job_id": 11111
 });

  var start_cluster_options = {
     host: "aaa.cloud.databricks.com",
     port: 443,
     path: "/api/2.0/clusters/start",
     method: "POST",
     // authentication headers
     headers: {
      "Authorization": "Basic " + new Buffer(tokenstr).toString("base64"),
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(data)
     }
  };

 var run_job_options = {
     host: "aaa.cloud.databricks.com",
     port: 443,
     path: "/api/2.0/jobs/run-now",
     method: "POST",
     // authentication headers
     headers: {
      "Authorization": "Basic " + new Buffer(tokenstr).toString("base64"),
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(data2)
    }
  };

exports.handler = (event, context, callback) => 
{
   https.request(start_cluster_options, function(res){});
   setTimeout(() => {
    https.request(run_job_options, function(res){});
    callback(); // notify lambda everything is complete
    }, 60);
};

I usually write Lambda functions in Python, but I am extending this one from a sample, so I'm not familiar with Node.js coding.

****** END OF UPDATE ******

Ideally I would like to keep this within AWS Lambda, without going into AWS Step Functions, etc.

Thanks

You can do this with AWS Step Functions. It is basically a workflow service.

At a high level, this is what you may want to do:

1) Run your Lambda to start the cluster and return the cluster ID.
2) Check the cluster status every 10 seconds.
3) If the cluster is up, execute the `submit job` Lambda function.
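
The three steps above can also be sketched as plain Node.js, which may help when deciding how to split them into workflow states. This is only an illustration: `startCluster`, `getState` and `submitJob` are hypothetical async functions you would supply, and "RUNNING" is assumed to be the state string the cluster reports once it is up.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `getState` until it returns `target`, or give up after `maxAttempts`.
// `getState` is a hypothetical async function returning the cluster state string.
async function waitForState(getState, target, intervalMs, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await getState();
    if (state === target) {
      return attempt; // number of polls it took
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`state did not reach ${target} after ${maxAttempts} polls`);
}

// Steps 1 to 3 in order; startCluster and submitJob are hypothetical async functions.
async function startThenSubmit(startCluster, getState, submitJob) {
  await startCluster();
  // Check every 10 seconds, for at most 30 polls (an assumed limit).
  await waitForState(getState, "RUNNING", 10 * 1000, 30);
  return submitJob();
}
```

Polling for the state rather than sleeping for a fixed time means the job is submitted as soon as the cluster is ready, and a cluster that never comes up produces an error instead of a failed job.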

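Let's say you have this abstracted down to two functions.

startServer and runJob

Your Lambda will run until you call the callback or the function's configured timeout expires. The default timeout is 3 seconds, which is why the task in the question timed out, so raise it above the total wait before trying this. Note also that setTimeout takes milliseconds and that https.request sends nothing until end() is called. So you could write code that looks like this.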

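exports.handler = (event, context, callback) => {
  // `data` and `data2` are the request bodies defined in the question's code.
  // The response callback only fires once the request has been sent with end().
  const startReq = https.request(start_cluster_options, function (res) {
    res.resume(); // drain the response so the socket is released

    // setTimeout takes milliseconds: 60 * 1000 waits 60 seconds
    setTimeout(() => {
      const runReq = https.request(run_job_options, function (res) {
        res.resume();
        res.on("end", () => callback()); // notify Lambda everything is complete
      });
      runReq.on("error", callback);
      runReq.write(data2);
      runReq.end();
    }, 60 * 1000);
  });
  startReq.on("error", callback);
  startReq.write(data);
  startReq.end();
};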
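
If the nested callbacks get hard to follow, the same start, wait, run sequence can be sketched with promises. This is a minimal sketch, not the only way to do it: `postJson`, `delay` and `runInSequence` are hypothetical helper names, and it assumes a Node.js runtime that supports async handlers.

```javascript
const https = require("https");

// Send `body` using the given https options; resolve with the response text.
function postJson(options, body) {
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      let text = "";
      res.on("data", (chunk) => { text += chunk; });
      res.on("end", () => resolve(text));
    });
    req.on("error", reject);
    req.write(body);
    req.end();
  });
}

// Resolve after `ms` milliseconds.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `first`, wait `waitMs` milliseconds, then run `second`.
async function runInSequence(first, second, waitMs) {
  const firstResult = await first();
  await delay(waitMs);
  const secondResult = await second();
  return [firstResult, secondResult];
}

// Possible use in a handler (option and body names taken from the question):
// exports.handler = async () =>
//   runInSequence(
//     () => postJson(start_cluster_options, data),
//     () => postJson(run_job_options, data2),
//     60 * 1000
//   );
```

With an async handler, Lambda treats the returned promise as the completion signal, so no explicit callback is needed. The function timeout still has to be longer than the 60-second wait.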

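Another easy way of doing this is with SQS. Lambdas can now use SQS as an event source. So you could have the first Lambda put a message in an SQS queue and delay its delivery for however long you need, then let that message trigger the Lambda that runs the job. Note that delaying the first delivery is done with a delivery delay (for example DelaySeconds on the message); the visibility timeout only hides a message after a consumer has already received it.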
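
As a rough sketch of the SQS route, the message that triggers the job can be delayed with the per-message DelaySeconds parameter of SendMessage, which SQS limits to 0 through 900 seconds. The snippet below only builds the parameter object; `buildDelayedMessage`, the queue URL and the payload are hypothetical, and you would pass the result to the SendMessage call of whichever AWS SDK version you use.

```javascript
// Build the parameters for an SQS SendMessage call with a per-message delay.
// SQS accepts DelaySeconds from 0 to 900 (15 minutes).
function buildDelayedMessage(queueUrl, payload, delaySeconds) {
  if (!Number.isInteger(delaySeconds) || delaySeconds < 0 || delaySeconds > 900) {
    throw new RangeError("delaySeconds must be an integer from 0 to 900");
  }
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify(payload),
    DelaySeconds: delaySeconds,
  };
}

// Hypothetical queue URL and payload, for illustration only.
const params = buildDelayedMessage(
  "https://sqs.us-east-1.amazonaws.com/123456789012/run-job-queue",
  { job_id: 11111 },
  60
);
console.log(params);
```

The Lambda that starts the cluster would send this message, and a second Lambda subscribed to the queue would run the job when the message becomes available. A fixed delay still does not guarantee the cluster is up, so the consumer may want to check the cluster state before submitting.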
