Unable to scrape a website whose URL doesn't change on the next page when more content is loaded, using requests and BeautifulSoup

Question

Python BeautifulSoup requests

import requests
import re
import os
import csv
from bs4 import BeautifulSoup





for d in searche:
    truelink = d.replace(" ","-")
    truelinkk=('https://www.fb.com

    r = requests.get(truelinkk,headers=headers).text
    soup=BeautifulSoup(r,'lxml')
    mobile=soup.find_all('li',class_='EIR5N')

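I am a beginner in Python. Using requests and BeautifulSoup, I can't scrape a website where more content is loaded on the next page without the URL changing. Could someone look at the site and show me how to scrape it with BeautifulSoup and requests? Any answer would be appreciated, thanks. Please see this link: https://www.olx.in/hyderabad_g4058526/q-Note-9-max-pro?isSearchCall=true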

Answer 1

You can use selenium in headless mode instead of requests. Even though selenium is meant for web automation, it can help you in this case.

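import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

begin = time.time()  # start time, handy for measuring how long the scrape takes

options = Options()
options.add_argument('--headless')  # run Chrome without opening a window
options.add_argument('--log-level=3')  # suppress most Chrome log output
driver = webdriver.Chrome(options=options)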

Since the URL does not change, you have to locate the button you need by its XPath and click it:

from selenium.webdriver.common.by import By  # find_element_by_xpath was removed in Selenium 4
driver.find_element(By.XPATH, 'xpath code').click()
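On a "load more" page the click usually has to be repeated until the button disappears. The following is only a rough sketch of that loop, not a tested recipe for the site in the question: the helper name `click_load_more`, the `max_clicks` cap, and the fixed `pause` are all assumptions made for illustration, and `xpath` is whatever XPath you copied for the button.

```python
import time


def click_load_more(driver, xpath, max_clicks=20, pause=2.0):
    """Click the element matched by `xpath` until it is gone or max_clicks is hit.

    `driver` is expected to be a selenium WebDriver (hypothetical helper,
    not part of selenium itself). Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # "xpath" is the string value of selenium's By.XPATH; find_elements
        # returns an empty list (instead of raising) when nothing matches.
        buttons = driver.find_elements("xpath", xpath)
        if not buttons:
            break
        buttons[0].click()
        clicks += 1
        time.sleep(pause)  # crude wait so the newly loaded items can render
    return clicks
```

With the `driver` created earlier you would first call `driver.get(url)` and then `click_load_more(driver, 'xpath code')`. A fixed sleep is the simplest option; selenium's `WebDriverWait` is generally more reliable if the page loads slowly.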

You can avoid requests altogether and get the page source with the following:

html_text = driver.page_source
soup = BeautifulSoup(html_text, 'lxml')
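Once the page source is in hand, the listings can be pulled out the same way the question's code tries to. This is a minimal sketch under assumptions: `extract_listings` is a hypothetical helper name, the class `EIR5N` is taken from the question and may well have changed on the live site, and `html.parser` is used here only to avoid depending on lxml.

```python
from bs4 import BeautifulSoup


def extract_listings(html_text, item_class="EIR5N"):
    """Return the visible text of every <li> that carries `item_class`."""
    soup = BeautifulSoup(html_text, "html.parser")
    items = soup.find_all("li", class_=item_class)
    # Join the text fragments inside each item with single spaces.
    return [li.get_text(" ", strip=True) for li in items]
```

For example, `extract_listings(driver.page_source)` would return one string per listing after all the "load more" clicks are done.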

Solution 1
0 2021-06-17 18:51:16
