I am trying to pass a website as a parameter. It works if the website does not have a "/" in it. For example, http://192.168.1.156:2434/www.cookinglight.com scrapes Cooking Light for all the images on its page; however, if I pass in http://192.168.1.156:2434/https://www.cookinglight.com/recipes/chicken-apple-butternut-squash-soup then I get an invalid response. Here is my current code:
import json
from flask import Flask, render_template
from imagescraper import image_scraper

app = Flask(__name__)

@app.route("/", methods = ['GET'])
def home():
    return render_template('index.html')

@app.route("/<site>", methods = ['GET'])
def get_image(site):
    return json.dumps(image_scraper(site))

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=2434, debug=True)

import requests
from bs4 import BeautifulSoup

def image_scraper(site):
    """scrapes user inputed url for all images on a website and
    :param http url ex. https://www.cookinglight.com
    :return dictionary key:alt text; value: source link"""
    search = site.strip()
    search = search.replace(' ', '+')
    website = 'https://' + search
    response = requests.get(website)
    soup = BeautifulSoup(response.text, 'html.parser')
    img_tags = soup.find_all('img')
    # create dictionary to add image alt tag and source link
    images = {}
    for img in img_tags:
        try:
            name = img['alt']
            link = img['src']
            images[name] = link
        except:
            pass
    return images

I tried urllib but did not have any success. Any help would be greatly appreciated! I am a student, so I am still learning!
Flask uses / as the separator between arguments, so you can create route("/<arg1>/<arg2>/<arg3>") and get the values in the variables arg1, arg2, arg3. The default converter for a variable such as <site> does not accept slashes, so when you request a URL that contains /, Flask tries to match it against a route like route("/<arg1>/<arg2>/<arg3>") instead. If you want / to be part of a single argument rather than a separator between arguments, you need <path:site>.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello World"

@app.route("/<path:site>")
def get_image(site):
    return f"OK: {site}"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=2434)  #, debug=True)
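
One follow-on detail worth checking: once <path:site> lets the full address through, site may already start with https://, while image_scraper in the question always prepends 'https://'. Below is a minimal sketch of one way to handle that. The helper name normalize_site is hypothetical and not part of the original code. It also assumes the double slash after the scheme might arrive collapsed to a single slash (this can depend on the Werkzeug version and its slash-merging settings), so it accepts either form.

```python
import re

def normalize_site(site):
    """Hypothetical helper: turn the captured path into a URL for requests.get()."""
    site = site.strip()
    # Accept "https://host/..." and also "https:/host/..." in case the
    # double slash was collapsed before the value reached the view.
    match = re.match(r'^(https?):/+(.*)$', site, flags=re.IGNORECASE)
    if match:
        scheme, rest = match.groups()
        return f"{scheme.lower()}://{rest}"
    # No scheme was given (e.g. "www.cookinglight.com"): assume https,
    # as the question's image_scraper does.
    return "https://" + site
```

Inside image_scraper, the line website = 'https://' + search could then become website = normalize_site(search), so that both the short form and the full URL from the question resolve to a valid address.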