
I can't scrape a website whose URL doesn't change when the next page is loaded with "load more", using requests and BeautifulSoup

Python BeautifulSoup requests

import requests
import re
import os
import csv
from bs4 import BeautifulSoup





for d in searche:
    truelink = d.replace(" ","-")
    truelinkk=('https://www.fb.com

    r = requests.get(truelinkk, headers=headers).text
    soup = BeautifulSoup(r, 'lxml')
    mobile = soup.find_all('li', class_='EIR5N')
 

I am a beginner in Python. I can't scrape a website whose URL doesn't change when more results are loaded with "load more", using requests and BeautifulSoup. Could someone visit the site and let me know the procedure for scraping it with BeautifulSoup and requests? Any answer would be appreciated. Thank you. Please look at this link: https://www.olx.in/hyderabad_g4058526/q-Note-9-max-pro?isSearchCall=true

You can use Selenium in headless mode instead of requests. Even though Selenium is meant for web automation, it can help you in this case.

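import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

begin = time.time()  # start time, useful for measuring how long the scrape takes

options = Options()
# Run Chrome without a visible window; the options.headless attribute is deprecated in Selenium 4
options.add_argument('--headless')
options.add_argument('--log-level=3')  # log fatal errors only
driver = webdriver.Chrome(options=options)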

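Since the URL doesn't change, you have to click the button you want. Locate it by its XPath and click it (Selenium 4 removed find_element_by_xpath, so use find_element with By.XPATH):

from selenium.webdriver.common.by import By
driver.find_element(By.XPATH, 'xpath code').click()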

You don't need requests at all; you can get the source of the rendered page with:

html_text = driver.page_source
soup = BeautifulSoup(html_text, 'lxml')
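
Once the page source is in a soup, the listing items can be pulled out the same way as in the question. The sketch below assumes the items are still li elements with the class EIR5N taken from the question's code; such class names look auto-generated and may change. The function name extract_items and the sample HTML are made up for illustration, and html.parser is used here only to avoid the extra lxml dependency.

```python
from bs4 import BeautifulSoup


def extract_items(html_text, item_class="EIR5N"):
    """Return the visible text of every <li> carrying item_class."""
    soup = BeautifulSoup(html_text, "html.parser")
    items = soup.find_all("li", class_=item_class)
    # Join the text fragments inside each item with single spaces
    return [item.get_text(" ", strip=True) for item in items]


# Made-up sample markup standing in for driver.page_source
sample = """
<ul>
  <li class="EIR5N"><span>Phone A</span> <span>9,000</span></li>
  <li class="EIR5N"><span>Phone B</span> <span>8,500</span></li>
  <li class="other">Ad</li>
</ul>
"""

for text in extract_items(sample):
    print(text)
```

In the real flow you would pass driver.page_source instead of the sample string, and call driver.quit() once you are done with the browser.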

