BeautifulSoup web scraping multiple pages URL doesn't change

Question

When using Beautiful Soup to scrape reviews, I run into a problem with the "All Audience" reviews: the URL doesn't update when I move between pages of the review list.

Here is an example: https://www.rottentomatoes.com/m/midsommar/reviews?type=user

The URL does not change when I click Next.

Based on answers in another thread, I tried tracking the XHR requests (I might be describing this wrong). I believe the exact request being made is the one I have highlighted in the picture here. (I don't have 10 reputation, so I can't post the image inline.)

Network Method Post

When I look at the headers of that GET request, I see a Request URL, and opening it returns all of the info I need. The problem is that I don't know their naming convention for getting to the next page. Below is how the Request URLs change between pages.

Request URL page 1->2

Request URL page 2->3

How can I get Beautiful Soup to iterate over these?

Thanks!

The code below should be enough to attempt this; ignore some of the naming.

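import re  # needed for re.findall / re.sub below
from bs4 import BeautifulSoup as soup
from urllib.request import Request, urlopen

# Turn the movie title into the slug used in the URL, e.g. "Some Movie" -> "some_movie"
x = input('What Movie?').replace(" ", "_").lower()

# Send a browser-like User-Agent with the request
req_rot = Request('https://www.rottentomatoes.com/m/' + x + '/reviews?type=user', headers={'User-Agent': 'Mozilla/5.0'})
webpage_rot = urlopen(req_rot).read()
page_soup_rot = soup(webpage_rot, "html.parser")

# Only the first page of audience reviews is present in this HTML
reviews_rot = page_soup_rot.find_all("div", {"class": "audience-reviews__review-wrap"})

# Pull the review text out of the markup, then split it into words
z_rot = re.findall(r'js-clamp"(.+)</p>', str(reviews_rot))
Movie_Adj_rot = re.sub(r"[^\w]", " ", str(z_rot)).split()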

Answer 1

A better description of this issue is windowed pagination. The simplest solution I found was to learn Selenium and call a scrape function inside a ranged loop that clicks the Next button element on each page.
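
As a rough sketch of that loop, assuming Selenium 4 and a CSS selector for the Next button that you would have to find by inspecting the page: the helper below reads `driver.page_source`, hands it to a parsing callback, then clicks Next and repeats. The names `scrape_paginated`, `parse_page`, and `example_run`, and the `"button.next"` selector, are made up for illustration and are not taken from the site.

```python
import time


def scrape_paginated(driver, parse_page, next_selector, max_pages=5, delay=1.0):
    """Collect parse_page(html) results from up to max_pages pages.

    driver is expected to behave like a Selenium WebDriver (page_source,
    find_element). parse_page and next_selector are hypothetical names
    chosen for this sketch.
    """
    results = []
    for page_number in range(max_pages):
        # Scrape whatever the browser is currently showing.
        results.extend(parse_page(driver.page_source))
        if page_number == max_pages - 1:
            break  # no need to click past the last page we want
        try:
            # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
            next_button = driver.find_element("css selector", next_selector)
        except Exception:
            # Selenium raises NoSuchElementException here; catching Exception
            # keeps this helper free of a hard Selenium import.
            break
        next_button.click()
        time.sleep(delay)  # crude wait for the next batch of reviews to load
    return results


def example_run(movie_slug="midsommar"):
    # Imports live here so the helper above works without these installed.
    from bs4 import BeautifulSoup
    from selenium import webdriver

    def parse_page(html):
        page = BeautifulSoup(html, "html.parser")
        wraps = page.find_all("div", {"class": "audience-reviews__review-wrap"})
        return [wrap.get_text(" ", strip=True) for wrap in wraps]

    driver = webdriver.Chrome()
    try:
        driver.get(f"https://www.rottentomatoes.com/m/{movie_slug}/reviews?type=user")
        # "button.next" is a placeholder selector, not the site's real one.
        return scrape_paginated(driver, parse_page, "button.next")
    finally:
        driver.quit()
```

The fixed `time.sleep` is the simplest thing that could work; an explicit wait such as Selenium's `WebDriverWait` would likely be more reliable. The loop also assumes the Next button disappears on the last page. If the site only disables it, the loop keeps clicking until `max_pages` is reached.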
