
How to scrape multiple pages with an unchanging url in R?

Here is the URL I am working with (shown in the code below).

My goal is to scrape the review section, but the URL does not change when I move between pages. My code is given below:

library(rvest)   # read_html(), html_nodes(), html_text()
library(purrr)   # map()
library(dplyr)   # bind_rows()

url <- "https://www.n11.com/magaza/thbilisim/magaza-yorumlari"

# Fetch one page and return its review paragraphs as a data frame
getreviews <- function(page_url){
  as.data.frame(
    read_html(page_url) %>% 
      html_nodes("div.commentContainer p") %>% 
      html_text()
  )
}

reviews <- url %>% 
  map(getreviews) %>%  
  bind_rows()

How can I scrape multiple pages when the URL stays the same? Thanks in advance.

In Chrome, for example, you can find the URL that is actually requested for each page by opening the developer tools (press F12) and watching the Network pane.

In your example above, you will see that each page requests https://www.n11.com/component/render/sellerShopFeedbacks?page=<page number>&sellerId=2145005, where <page number> is 1, 2, 3, ...

The requested URL appears in the Network tab when you click a page number at the bottom of the original page.

So you just need to increment the page number in your R code to fetch the subsequent pages.
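The approach above can be sketched as follows, reusing the question's getreviews() function against the paginated endpoint. This is a sketch, not tested against the live site: the sellerId (2145005) comes from the URL observed in DevTools, and the page range 1:5 is an assumption; in practice you would determine the last page from the pager or stop when a request returns no reviews.

```r
library(rvest)   # read_html(), html_nodes(), html_text()
library(purrr)   # map()
library(dplyr)   # bind_rows()

# Fetch one page and return its review paragraphs as a data frame
getreviews <- function(page_url) {
  as.data.frame(
    read_html(page_url) %>%
      html_nodes("div.commentContainer p") %>%
      html_text()
  )
}

# Build one URL per page using the endpoint seen in the Network pane.
# Assumption: 5 pages of reviews; adjust to the real page count.
pages <- 1:5
urls <- sprintf(
  "https://www.n11.com/component/render/sellerShopFeedbacks?page=%d&sellerId=2145005",
  pages
)

# Scrape every page and stack the results into one data frame
reviews <- urls %>%
  map(getreviews) %>%
  bind_rows()
```

Because each page is now a distinct URL, the original map() / bind_rows() pipeline works unchanged; only the input vector of URLs differs.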
