How to extract anchor class using beautifulsoup?

Can anyone help me resolve this error? I have already spent an hour on it. quotes_links comes back empty, although I expect it to contain every anchor element with the class I am searching for.

from flask import Flask, jsonify, request
# from markupsafe import escape
from bs4 import BeautifulSoup
import requests

app = Flask(__name__)

@app.route('/api/',methods=['GET'])
def API():
    if request.method == 'GET':
        url = 'https://www.brainyquote.com/'
        query = str(request.args['query'])

        if " " in query:
            query = str(query).replace(" ","+")
        else:
            pass

        search = '/search_results?q=' + query
        ready_url = url + search
        content = requests.get(ready_url).content
        soup =  BeautifulSoup(content, 'html.parser')
        quotes_links = soup.find_all("a", class_= "b-qt")
        print("hello")
        print(quotes_links)
        list = []
        for i in quotes_links:
            d = {}
            quote_url = url + i.get('href')
            quote_content = requests.get(quote_url).content
            quote_soup = BeautifulSoup(quote_content, 'html.parser')
            d['quote'] = quote_soup.find('p', class_= "b-qt").text
            d['author'] = str(quote_soup.find('p', class_= "bq-aut").text).strip()
            list.append(d)
            

        return jsonify(list)

if __name__ == "__main__":
    app.run(debug=True)

Please help. Why am I not getting any values in the JSON response? My list is empty, and quotes_links is empty as well. Is there a syntax mistake, or is something else wrong?

Your ready_url variable ends up with a double slash in it (i.e. https://www.brainyquote.com//search_results?q=testing ), because url ends with a slash and search begins with one. If you test that URL in a browser or with curl, you'll see that it yields no results. If you fix the definition of url so it doesn't have the trailing slash (i.e. url = 'https://www.brainyquote.com' ), your code will work.
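
If you would rather not depend on whether the base URL ends with a slash, one option is to let the standard library assemble the URL. This is only a sketch: build_search_url and BASE_URL are names I made up for illustration, and it only constructs the URL, it does not make the request.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical constant; the trailing slash is harmless when using urljoin.
BASE_URL = "https://www.brainyquote.com/"


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Return the search URL for `query` without a doubled slash."""
    # urlencode turns spaces into '+' and escapes other special characters,
    # which replaces the manual query.replace(" ", "+") step.
    params = urlencode({"q": query})
    # urljoin resolves the absolute path against the base, so there is exactly
    # one slash after the host whether or not base_url ends in '/'.
    return urljoin(base_url, "/search_results") + "?" + params


print(build_search_url("albert einstein"))
```

In the question's code, ready_url could then be set from build_search_url(query) before calling requests.get.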
