I'm trying to webscrape data from a website into R

Question

I'm not sure what I'm missing in my code. I'm trying to webscrape data from https://www.espn.com/nfl/standings/_/season/2010 into a tibble in R. My code so far is the following:

library(tidyverse)
library(rvest)

# url I want the data from. 
NFL_2010.url <- "https://www.espn.com/nfl/standings/_/season/2010"
# Use webscraping to import the data from the url into R
NFL_2010 <- NFL_2010.url %>%
  read_html(NFL_2010) %>%
  #There is more than 1 table, so I'm trying to use html_nodes 
  html_nodes("table") %>%
  html_table () %>%
  #convert data to a tibble
  as_tibble()

What am I missing here?

Answer 1

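Scraping this page returns a list of four data frames, because each conference's standings table is split into two pieces. as_tibble() does not turn that list into a single table, so you have to join the pieces column-wise first and then convert each conference to its own tibble, giving two tibbles in total. Also note that read_html(NFL_2010) inside the pipe passes NFL_2010, which does not exist yet, as an extra argument; call read_html() with no arguments instead. For example: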

library(tidyverse)
library(rvest)

NFL_2010.url <- "https://www.espn.com/nfl/standings/_/season/2010"

NFL_2010 <- NFL_2010.url %>%
  read_html() %>%
  html_nodes("table") %>%
  html_table()

# American Football Conference
NFL_2010_AFC <- bind_cols(NFL_2010[[1]], NFL_2010[[2]]) %>%
  as_tibble()

# National Football Conference
NFL_2010_NFC <- bind_cols(NFL_2010[[3]], NFL_2010[[4]]) %>%
  as_tibble()

The result still needs some data cleaning after that.
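
To see the join step without depending on the live page, here is a minimal offline sketch. The HTML string, team names, and numbers are made up for illustration and are not ESPN's actual markup; it only assumes that one logical table arrives as two separate table elements, as described above.

```r
library(rvest)
library(dplyr)

# Hypothetical stand-in for the scraped page: one standings table
# delivered as two <table> elements (team names, then stats).
page <- read_html('
  <table>
    <tr><th>Team</th></tr>
    <tr><td>Team A</td></tr>
    <tr><td>Team B</td></tr>
  </table>
  <table>
    <tr><th>W</th><th>L</th></tr>
    <tr><td>10</td><td>6</td></tr>
    <tr><td>7</td><td>9</td></tr>
  </table>
')

# html_table() returns a list with one data frame per <table> element
pieces <- page %>%
  html_nodes("table") %>%
  html_table()

# Join the two pieces side by side, then convert to a tibble
standings <- bind_cols(pieces[[1]], pieces[[2]]) %>%
  as_tibble()

print(standings)
```

bind_cols() matches rows by position, so this only works when both pieces list the teams in the same order, which is worth checking on the real data before relying on it.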
