Web scraping using R: I want to extract some table-like data from a website

Question

I'm having some problems scraping data from a website, and I don't have much experience with web scraping. My plan is to use R to scrape some data from the following website: https://www.shipserv.com/supplier/profile/s/ww-grainger-inc-59787/brands

More precisely, I want to extract the brands on the right-hand side.

My idea so far:

library(rvest)

brands <- read_html('https://www.shipserv.com/supplier/profile/s/w-w-grainger-inc-59787/brands') %>%
  html_nodes(xpath = '/html/body/div[1]/div/div[2]/div[2]/div[2]/div[4]/div/div/div[3]/div/div[1]/div') %>%
  html_text()

But this doesn't return the intended information. Any help would be really appreciated. Thanks!

Answer 1

That data is dynamically pulled from a script tag (the one with id __NEXT_DATA__). You can pull the content of that script tag and parse it as JSON, subset the returned list to just the items of interest, and then extract the brand names:

library(rvest)
library(jsonlite)
library(stringr)

data <- read_html('https://www.shipserv.com/supplier/profile/s/w-w-grainger-inc-59787/brands') %>% 
  html_node('#__NEXT_DATA__') %>% html_text() %>% 
  jsonlite::parse_json()

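# Drill down to the apolloState element of the parsed JSON
data <- data$props$pageProps$apolloState
# Keep only the entries whose key starts with "Brand:"
# (str_detect is vectorised, so no map() from purrr is needed)
mask <- str_detect(names(data), '^Brand:')
data <- subset(data, mask)
# Extract the name of each brand
brands <- lapply(data, function(x){x$name})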

I find the above easier to read, but you could try other methods, such as:

library(rvest)
library(jsonlite)
library(stringr)

brands <- read_html('https://www.shipserv.com/supplier/profile/s/w-w-grainger-inc-59787/brands') %>% 
  html_node('#__NEXT_DATA__') %>% html_text() %>% 
  jsonlite::parse_json() %>% 
  {.$props$pageProps$apolloState} %>% 
  subset(., {str_detect(names(.), 'Brand:')}) %>% 
  lapply(. , function(x){x$name})

Using {} so that the call is treated as an expression rather than a function call is something I read in a comment by @asachet.
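To see what the braces do without hitting the website, here is a minimal offline sketch. The nested list is a hypothetical stand-in for the parsed JSON (the keys and brand names are invented for illustration), and it assumes the magrittr package is installed. Inside braces, %>% evaluates the expression with . bound to the left-hand side instead of inserting it as the first argument of a call.

```r
library(magrittr)

# Hypothetical stand-in for the parsed JSON; keys and names are invented.
state <- list(
  props = list(pageProps = list(apolloState = list(
    `Brand:1` = list(name = "Acme"),
    `Brand:2` = list(name = "Globex"),
    `Supplier:9` = list(name = "Example Supplier")
  )))
)

brand_names <- state %>%
  # Braces: evaluated as an expression with `.` bound to the input,
  # so `.` is not also passed as a first argument.
  {.$props$pageProps$apolloState} %>%
  # Keep only the entries whose key starts with "Brand:".
  {.[startsWith(names(.), "Brand:")]} %>%
  # Pull out each name as a character vector rather than a list.
  vapply(function(x) x$name, character(1)) %>%
  unname()

print(brand_names)
```

With this mock input, brand_names holds the two brand names and drops the supplier entry. Using vapply instead of lapply is just one option; it returns a plain character vector, which may be easier to work with than a list.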

solution1
0 ACCEPTED 2021-03-17 21:42:00
