
Run Scrapy from a Flask application

I have a crawler that I want to run every time someone visits a certain link. Since all the other modules are in Flask, I was told to build this in Flask as well. I have installed both Scrapy and Selenium in the virtual environment and globally on the machine as root.

When I run the crawler from the terminal, everything works fine. When I start the Flask application and visit xx.xx.xx.xx:8080/whats in the browser, it also works: the crawler runs and I get the file. But as soon as I go live, so that anyone can visit the link, the browser shows an internal server error.

To run the crawler, you have to type "scrapy crawl whateverthespidernameis" in the terminal. I did this using Python's os module.

Here is my Flask code:

import os
import sys
from flask import Flask, send_file
# from application1 import *
from main import *
from test123 import *

app = Flask(__name__)  # create the application object once

filename = ''

@app.route('/whats')
def whats():
    # change into the Scrapy project directory before invoking the spider
    os.chdir("/var/www/myapp/whats")
    # cmd = "scrapy crawl whats"
    cmd = "sudo scrapy crawl whats"
    os.system(cmd)
    return send_file("/var/www/myapp/staticcsv/whats.csv", as_attachment=True)

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8080, debug=True)

This is the error recorded in the log file when I run it through the live link:

sh: 1: scrapy: not found

This is the error recorded in the log file when I use sudo in the command (variable cmd):

sudo: no tty present and no askpass program specified

I am using uWSGI and nginx.

How can I run this crawler so that when anyone goes to "xx.xx.xx.xx/whats" the crawler runs and returns the CSV file?

When you use sudo, the shell it starts asks for a password on the tty; it specifically does not read standard input for this information. Since Flask and other web applications typically run detached from a terminal, sudo has no way to ask for a password, so it looks for an askpass program that can supply one. You can find more information on this topic in this answer.
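As a quick way to see this for yourself, here is a minimal sketch (the /tty-check route is made up purely for illustration) that reports whether the worker process has a controlling terminal:

import sys
from flask import Flask

app = Flask(__name__)

# In an interactive shell sys.stdin.isatty() returns True; under
# uwsgi/nginx the worker is detached from any terminal, so it returns
# False - which is exactly why sudo has nowhere to prompt for a password.
@app.route('/tty-check')
def tty_check():
    return "stdin is a tty: {}".format(sys.stdin.isatty())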

The reason scrapy isn't being found is most likely a difference in $PATH between the interactive shells you used for testing and the process that runs Flask. The easiest way around this is to give the full path to the scrapy program in your command, as in the sketch below.
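For example, here is a minimal rewrite of the route using an absolute path, with subprocess.check_call swapped in for os.system so a failed crawl raises an error instead of failing silently (the venv path below is an assumption; run "which scrapy" in the shell where the crawler works to find the real one):

import subprocess
from flask import Flask, send_file

app = Flask(__name__)

# Assumed location of the scrapy executable inside the virtualenv -
# replace this with the output of "which scrapy" from a working shell.
SCRAPY_BIN = "/var/www/myapp/venv/bin/scrapy"

@app.route('/whats')
def whats():
    # cwd= replaces the os.chdir() call so the process-wide working
    # directory is left alone; check_call raises CalledProcessError
    # (visible in the uwsgi log) if the crawl exits with an error.
    subprocess.check_call([SCRAPY_BIN, "crawl", "whats"],
                          cwd="/var/www/myapp/whats")
    return send_file("/var/www/myapp/staticcsv/whats.csv", as_attachment=True)

Note that no sudo is needed here: once the executable is addressed by its full path, the uwsgi worker can run it with its own privileges.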
