Not all HTML elements returned from Beautifulsoup find_all method

Question

I'm trying to use Beautiful Soup to pull data from a website. However, when I use the find_all function I get only a subset of the target elements (li): instead of 24 li items, only 12 are returned.

** Sample code **

from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'https://www.tomford.com/beauty/lips/'
headers = {'User-Agent': <using my useragent>}
# Pass headers by keyword: the second positional argument of requests.get is params.
reqs = requests.get(url, headers=headers)
soup = BeautifulSoup(reqs.text, 'lxml')

# Collect every product tile on the page.
ul_search_results = soup.find_all("li", {"class": "grid-tile"})

for li in ul_search_results:
    print(li.attrs.get('id'))

I have also tried first fetching the parent element of all the li's using soup.find_all("ul", {"id": "search-result-items"}) and then iterating over its li tags. That didn't return the complete results either!

Appreciate any help here!

Answer 1

This is happening because the site only shows 12 items to begin with. In the browser, when you scroll down it makes a second request and loads another 12.

The second request it makes is to this URL: https://www.tomford.com/beauty/lips/?start=12&sz=12&format=page-element&rendertype=macro

You can change this URL to suit your needs. Set start to 0 and sz to 1000, and you should get a page with all available items.
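If a single large request turns out not to work, one fallback is to walk the listing in batches of 12, the way the browser does. This is only a sketch: it assumes start is an item offset and sz is a page size, which is inferred from the URL above rather than from any documentation. The helper name page_urls is made up for illustration, and 24 is the item count mentioned in the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.tomford.com/beauty/lips/"


def page_urls(total_items, page_size=12):
    """Return one listing URL per batch of page_size items (hypothetical helper)."""
    urls = []
    for start in range(0, total_items, page_size):
        # Assumption: start is the item offset and sz is the batch size.
        query = urlencode({
            "start": start,
            "sz": page_size,
            "format": "page-element",
            "rendertype": "macro",
        })
        urls.append(f"{BASE_URL}?{query}")
    return urls


for url in page_urls(24):
    print(url)
```

Each of these URLs can then be fetched and parsed separately, and the resulting li lists concatenated.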

https://www.tomford.com/beauty/lips/?start=0&sz=1000&format=page-element&rendertype=macro
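Put together with the code from the question, that could look roughly like the sketch below. The function names build_page_url and fetch_tile_ids are made up for illustration, the user agent is whatever you already send, and the site may have changed its markup or parameters since this was written.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.tomford.com/beauty/lips/"


def build_page_url(start=0, size=1000):
    """Build the listing URL with the paging parameters seen in the second request."""
    query = urlencode({
        "start": start,
        "sz": size,
        "format": "page-element",
        "rendertype": "macro",
    })
    return f"{BASE_URL}?{query}"


def fetch_tile_ids(url, user_agent):
    """Download the page and return the id attribute of every li.grid-tile element."""
    # Imported here so build_page_url works without third-party packages installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "lxml")
    return [li.get("id") for li in soup.find_all("li", class_="grid-tile")]


# Example usage (performs a network request, so it is left commented out):
# ids = fetch_tile_ids(build_page_url(0, 1000), "my user agent string")
# print(len(ids))
```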


1 answer

solution1
0 ACCEPTED 2021-03-11 04:12:29
