
Use Beautiful Soup to scrape all questions a person has answered on Quora

How would I use Beautiful Soup to scrape all the questions a specific user has answered?

URL of the author
example: https://www.quora.com/profile/AUTHOR/answers

Column 1: Question the author has answered
example: "Lorem Ipsum Question"

Column 2: URL of the answered question
example: https://www.quora.com/lorem-ipsum-question

This script prints every answered question and its URL found in the initial page load. The page also uses infinite scrolling, which makes POST requests to https://www.quora.com/graphql/gql_para_POST?q=UserProfileAnswersMostRecent_RecentAnswers_Query, but I couldn't manage to get the data from that endpoint (you can see the requests in Developer Tools -> Network tab):

import re
import json
import requests

url = 'https://www.quora.com/profile/Nana-Bello-Shehu/answers'
html_data = requests.get(url).text

# The initial data is embedded in the page as a JSON-encoded string literal,
# so it has to be decoded twice: once to get the JSON text, once to parse it.
d = re.findall(r'window\.ansFrontendGlobals\.data\.inlineQueryResults\.results\[".*?"\] = ("{.*}");', html_data)[-1]
d = json.loads(json.loads(d))

for e in d['data']['user']['recentPublicAndPinnedAnswersConnection']['edges']:
    # Skip entries that are not answers.
    if e['node']['__typename'] != 'Answer':
        continue

    # The question title is itself stored as a JSON string.
    q = json.loads(e['node']['question']['title'])
    title = q['sections'][0]['spans'][0]['text']
    u = 'https://www.quora.com' + e['node']['question']['url']
    print('{:<90} {}'.format(title, u))


Do pictures speak louder than words?                                                       https://www.quora.com/Do-pictures-speak-louder-than-words
Does true love exist?                                                                      https://www.quora.com/Does-true-love-exist-8
What picture made your blood boil?                                                         https://www.quora.com/What-picture-made-your-blood-boil
What are the before and after pics of people who are drug addicts for several years?       https://www.quora.com/What-are-the-before-and-after-pics-of-people-who-are-drug-addicts-for-several-years
What was the funniest thing you saw/heard today?                                           https://www.quora.com/What-was-the-funniest-thing-you-saw-heard-today
Are there any truly selfless acts, motives, or people?                                     https://www.quora.com/Are-there-any-truly-selfless-acts-motives-or-people
Which famous person in history who is idolized, was actually a horrible person?            https://www.quora.com/Which-famous-person-in-history-who-is-idolized-was-actually-a-horrible-person
What is something that you read recently and is worth sharing?                             https://www.quora.com/What-is-something-that-you-read-recently-and-is-worth-sharing
How do I get the attention of my crush?                                                    https://www.quora.com/How-do-I-get-the-attention-of-my-crush
What are some heart touching stories of best friends?                                      https://www.quora.com/What-are-some-heart-touching-stories-of-best-friends
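
The double `json.loads` in the script above can look odd, so here is a minimal offline sketch of that extraction step. The `html_data` string, the key `abc123`, and the payload are made up for illustration and are not real Quora data; only the regular expression is taken from the script.

```python
import json
import re

# Synthetic stand-in for the page source (hypothetical key and payload).
inner = json.dumps({"data": {"user": {"name": "Example"}}})
html_data = (
    'window.ansFrontendGlobals.data.inlineQueryResults.results["abc123"] = '
    + json.dumps(inner)  # the JSON text is embedded as a quoted string literal
    + ';'
)

PATTERN = r'window\.ansFrontendGlobals\.data\.inlineQueryResults\.results\[".*?"\] = ("{.*}");'


def extract_payload(html):
    """Return the last embedded result as a Python dict."""
    literal = re.findall(PATTERN, html)[-1]
    as_text = json.loads(literal)  # first pass: string literal -> JSON text
    return json.loads(as_text)     # second pass: JSON text -> dict


payload = extract_payload(html_data)
print(payload["data"]["user"]["name"])
```

The first pass only removes the quoting and escaping; the second pass does the actual parsing. The question title in the script is decoded with one more `json.loads` for the same reason.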

I think the easiest way for you is to use Selenium:

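import time

from selenium import webdriver

# Selenium 3 API. Newer Selenium 4 releases replace executable_path with
# Service(...) and find_element(s)_by_* with find_elements(By.CSS_SELECTOR, ...).
driver = webdriver.Firefox(executable_path='c:/program/geckodriver.exe')
url = 'https://www.quora.com/profile/Nana-Bello-Shehu/answers'
SCROLL_TIME = 3  # seconds to wait after each scroll (assumed value, tune as needed)

driver.get(url)
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll to the bottom and give the page time to load more answers.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(SCROLL_TIME)

    # Stop once scrolling no longer makes the page taller.
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

qbox = driver.find_elements_by_css_selector('.qu-pb--medium')
for qb in qbox:
    link = qb.find_element_by_css_selector('a.q-box.qu-cursor--pointer.qu-hover--textDecoration--underline')
    # link.get_attribute('href') returns the question URL, already absolute.
    print(link.text)

driver.quit()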


Do pictures speak louder than words?

Does true love exist?

What picture made your blood boil?

What are the before and after pics of people who are drug addicts for several years?

What was the funniest thing you saw/heard today?

Are there any truly selfless acts, motives, or people?

And so on...

This script scrolls to the end of the page and copies all the questions. You can set a lower SCROLL_TIME to make the script faster, but with a shorter pause the script will sometimes stop before it reaches the end of the page.
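
The stopping rule described here (keep scrolling until the page height stops changing) can be sketched independently of Selenium. The function name `scroll_until_stable`, its parameters, and the `max_rounds` safety cap are hypothetical and only illustrate the idea.

```python
import time


def scroll_until_stable(get_height, scroll_down, pause, sleep=time.sleep, max_rounds=1000):
    """Scroll until the page height stops growing; return the number of scrolls."""
    last_height = get_height()
    for rounds in range(1, max_rounds + 1):
        scroll_down()
        # If the pause is too short, the height is read before new items
        # have loaded and the loop stops early.
        sleep(pause)
        new_height = get_height()
        if new_height == last_height:
            return rounds
        last_height = new_height
    return max_rounds


# With a Selenium driver this could be called as:
# scroll_until_stable(
#     lambda: driver.execute_script("return document.body.scrollHeight"),
#     lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"),
#     pause=SCROLL_TIME,
# )
```

Passing the height and scroll actions in as callables keeps the loop testable without a browser, and the cap guards against a page that never stops growing.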


  1. You need Selenium.
  2. You need Firefox.
  3. You need geckodriver. The script currently loads it from c:/program/geckodriver.exe, so if you put geckodriver somewhere else you need to change executable_path accordingly.
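
Neither script writes the fields the question asks for (author URL, question, question URL) to a file. A minimal sketch using the standard `csv` module might look like the following; the function name `write_answers_csv` and the header names are assumptions, and the two sample rows are copied from the output shown earlier.

```python
import csv
import io


def write_answers_csv(author_url, answers, fileobj):
    """Write one row per answered question: author URL, question, question URL."""
    writer = csv.writer(fileobj)
    writer.writerow(["author_url", "question", "question_url"])  # hypothetical header names
    for title, question_url in answers:
        writer.writerow([author_url, title, question_url])


# Two rows taken from the sample output above.
answers = [
    ("Do pictures speak louder than words?",
     "https://www.quora.com/Do-pictures-speak-louder-than-words"),
    ("Does true love exist?",
     "https://www.quora.com/Does-true-love-exist-8"),
]
buffer = io.StringIO()
write_answers_csv("https://www.quora.com/profile/Nana-Bello-Shehu/answers", answers, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, open the file with `open(path, "w", newline="", encoding="utf-8")` and pass it as `fileobj`.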

