
Launch Spark-Submit with restful service in Python

Following this tutorial, I've made a RESTful service in Python. Using this service, I want to call another Python script with spark-submit, but it doesn't work.

Here is my service.py:

import pickle
import subprocess
from flask import Flask, request
from flask_restful import Resource, Api
from json import dumps
from flask_jsonpify import jsonify

app = Flask(__name__)
api = Api(app)

class Test(Resource):
    def post(self):
        imageID = request.form.get('imageID')
        tags = request.form.get('tags')

        return subprocess.call("spark-submit NaiveBayesClassifier.py",shell=True,stderr=subprocess.STDOUT)


api.add_resource(Test, '/test') 

if __name__ == '__main__':
    app.run(port=5002)

I start the service from a virtualenv like this:

source venv/bin/activate
python service.py

But when the script runs subprocess.call("spark-submit NaiveBayesClassifier.py",shell=True,stderr=subprocess.STDOUT), I get this error:

Running on http://127.0.0.1:5002/ (Press CTRL+C to quit)
OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file:
   34475
Try using the -Djava.io.tmpdir= option to select an alternate temp location.

OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file:
   34462
Try using the -Djava.io.tmpdir= option to select an alternate temp location.

Traceback (most recent call last):
  File "/home/usertest/project/NaiveBayesClassifier.py", line 2, in <module>
    import numpy
ImportError: No module named numpy
127.0.0.1 - - [23/Feb/2018 14:36:06] "POST /test HTTP/1.1" 200 -

Any ideas about the problem? I'm using Spark 1.6.1

I see three problems.

First, in your code, use Popen so that you can capture what spark-submit prints, like this:

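class Test(Resource):
    def post(self):
        imageID = request.form.get('imageID')
        tags = request.form.get('tags')

        # Pass the command as a list (no shell) and merge stderr into stdout
        p = subprocess.Popen(["spark-submit", "NaiveBayesClassifier.py"],
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        output, _ = p.communicate()  # blocks until spark-submit exits
        # Return JSON-serializable data instead of the raw (stdout, stderr) tuple
        return {'returncode': p.returncode, 'output': output.decode('utf-8', 'replace')}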
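
To illustrate why Popen helps here, the following is a minimal sketch (assuming Python 3.3+, with the hypothetical helper names run_with_call and run_with_popen) that contrasts the two approaches. It launches a harmless child Python process instead of spark-submit so it can be run without Spark installed.

```python
import subprocess
import sys

# Stand-in for spark-submit (an assumption for illustration): a child
# process that prints one line and exits with status 3.
CMD = [sys.executable, "-c",
       "import sys; print('hello from child'); sys.exit(3)"]


def run_with_call(cmd):
    """subprocess.call waits for the command and returns only its exit status."""
    return subprocess.call(cmd, stdout=subprocess.DEVNULL)


def run_with_popen(cmd):
    """Popen plus communicate() gives access to the output as well."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = p.communicate()  # waits for the process to finish
    return p.returncode, output.decode("utf-8", "replace")


if __name__ == "__main__":
    print("call  ->", run_with_call(CMD))
    print("Popen ->", run_with_popen(CMD))
```

With subprocess.call the REST handler can only report whether the job succeeded; with Popen it can also return the log text to the client.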

Second, with your virtualenv activated, install numpy using pip:

pip install numpy

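or use sudo if you get a permission error (note that sudo usually runs the system pip, so the package may be installed outside the virtualenv):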

sudo pip install numpy
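
Installing numpy only helps if spark-submit runs the script with the interpreter that has it. Spark reads the PYSPARK_PYTHON environment variable to choose the Python executable, so one option is to point it at the virtualenv's interpreter when launching the job. The sketch below assumes Python 3.4+ and that the Flask service itself runs inside the virtualenv; has_module and build_spark_env are hypothetical helper names.

```python
import importlib.util
import os
import sys


def has_module(name):
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None


def build_spark_env(base_env=None):
    """Copy the environment and point PySpark at the current interpreter.

    Assumption: the current interpreter (sys.executable) is the
    virtualenv's Python, i.e. the one where numpy was installed.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = sys.executable
    return env


if __name__ == "__main__":
    print("numpy importable here:", has_module("numpy"))
    print("PYSPARK_PYTHON:", build_spark_env()["PYSPARK_PYTHON"])
```

The resulting dictionary could then be passed as env=build_spark_env() to subprocess.Popen when starting spark-submit.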

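Third, the warning below means that there is no free space left on your disk, or more precisely on the partition that holds the JVM's temporary files (usually /tmp). Try deleting some large files or increasing the size of that partition if you can.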

warning: Insufficient space for shared memory file: 34475
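
A quick way to confirm the disk-space diagnosis is to check how much room is left in the temporary directory. This is a minimal sketch assuming Python 3.3+ and assuming that Python's temporary directory is the same one the JVM uses (commonly /tmp on Linux); free_mib is a hypothetical helper name.

```python
import shutil
import tempfile


def free_mib(path):
    """Return the free space, in MiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


if __name__ == "__main__":
    tmp = tempfile.gettempdir()  # assumption: same location the JVM writes to
    print("%s has %.1f MiB free" % (tmp, free_mib(tmp)))
```

If the reported value is close to zero, freeing space on that partition should make the warning go away.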


 