
Web scrape with rvest from a table that is not defined

I am trying to scrape a table from this website: http://www.oddsportal.com/american-football/usa/nfl-2012-2013/results/

Specifically, I want the results table in the middle of the page.

I have tried several approaches, without success.

library("rvest")
library(dplyr)

url1 <- "http://www.oddsportal.com/american-football/usa/nfl-2012-2013/results/"
table <- url1 %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="tournamentTable"]') %>% 
  html_table(fill = T)

This does not work. I believe the reason is that the table is not defined as a table element in the HTML.

I also tried to grab the rows separately by using:

 # parse the page first, then select the rows by CSS class
 df <- url1 %>% 
     read_html() %>% 
     html_nodes(css = "tr.odd.deactivate,tr.center.nob-border")

but it returns nothing.

Any idea how I can do this?

Thanks

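Based on previous questions from people trying to scrape this site, the table is probably generated dynamically by JavaScript, so it is not present in the HTML that read_html() downloads. As far as I know, the only way to deal with pages like this is to use RSelenium, which automates a real browser and lets you read the page after it has been rendered.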
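A quick way to check this kind of thing is to count the matching table nodes in the HTML you actually received. The sketch below uses two made-up HTML strings as stand-ins (they are assumptions for illustration, not the real page source): one as a static download might look, one as the browser might render it.

```r
library(rvest)

# Hypothetical stand-ins for the page, NOT the real oddsportal markup
static_html <- '<html><body><div id="tournamentTable"></div></body></html>'
rendered_html <- paste0(
  '<html><body><div id="tournamentTable">',
  '<table id="tournamentTable"><tr><td>23:30</td><td>1.49</td></tr></table>',
  '</div></body></html>'
)

# Count <table> elements with the id we are after
count_tables <- function(html) {
  html %>%
    read_html() %>%
    html_nodes(xpath = '//table[@id="tournamentTable"]') %>%
    length()
}

count_tables(static_html)    # 0: nothing for html_table() to parse
count_tables(rendered_html)  # 1: the table exists once the page is rendered
```

If the count is zero for the downloaded page but the table is visible in the browser, the content is being added after the initial load.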

After a lot of trial and error, the following code seems to work (using Chrome on Windows 10)...

library(RSelenium)
library(rvest)
library(dplyr)

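url <- "http://www.oddsportal.com/american-football/usa/nfl-2012-2013/results/"

# start a Selenium server and a Chrome session
rD <- rsDriver(port = 4444L, browser = "chrome")
remDr <- rD$client

remDr$navigate(url)

# getPageSource() returns a list; the rendered HTML is its first element
page <- remDr$getPageSource()
remDr$close() # you can leave it open if you are doing several of these: close at the end
# when completely finished, rD$server$stop() shuts down the Selenium server

table <- page[[1]] %>%
  read_html() %>%
  html_nodes(xpath = '//table[@id="tournamentTable"]') %>% # specify table, as there is a div with the same id
  html_table(fill = TRUE)

# html_table() returns a list of data frames; keep the first (and only) one
table <- table[[1]]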

head(table)

  American Football» USA»NFL 2012/2013   American Football» USA»NFL 2012/2013   American Football» USA»NFL 2012/2013 American Football» USA»NFL 2012/2013 American Football» USA»NFL 2012/2013 American Football» USA»NFL 2012/2013
1              03 Feb 2013 - Play Offs                03 Feb 2013 - Play Offs                03 Feb 2013 - Play Offs              03 Feb 2013 - Play Offs                                 1.00                                 2.00
2                                                                                                                                                                                           NA                                   NA
3                                23:30 San Francisco 49ers - Baltimore Ravens San Francisco 49ers - Baltimore Ravens                                31:34                                 1.49                                 2.71
4              28 Jan 2013 - All Stars                28 Jan 2013 - All Stars                28 Jan 2013 - All Stars              28 Jan 2013 - All Stars                                 1.00                                 2.00
5                                                                                                                                                                                           NA                                   NA
6                                00:00                              NFC - AFC                              NFC - AFC                                62:35                                 2.03                                 1.83
  American Football» USA»NFL 2012/2013
1                                  B's
2                                     
3                                    9
4                                  B's
5                                     
6                                    9
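The raw table mixes date header rows, blank rows and match rows, so it probably needs tidying before use. Here is a minimal sketch of one way to do that, using a small hand-built data frame that copies the six rows printed above. The column names X1, X2, X4, X5, X6 and the output names are assumptions for illustration; on the real table you could first rename the duplicated headers with names(table) <- paste0("X", seq_along(table)).

```r
# Hand-built copy of the rows shown above (only the columns used here)
raw <- data.frame(
  X1 = c("03 Feb 2013 - Play Offs", "", "23:30",
         "28 Jan 2013 - All Stars", "", "00:00"),
  X2 = c("03 Feb 2013 - Play Offs", "",
         "San Francisco 49ers - Baltimore Ravens",
         "28 Jan 2013 - All Stars", "", "NFC - AFC"),
  X4 = c("03 Feb 2013 - Play Offs", "", "31:34",
         "28 Jan 2013 - All Stars", "", "62:35"),
  X5 = c(1.00, NA, 1.49, 1.00, NA, 2.03),
  X6 = c(2.00, NA, 2.71, 2.00, NA, 1.83),
  stringsAsFactors = FALSE
)

# Date header rows start with e.g. "03 Feb 2013"; match rows start with a time
is_date  <- grepl("^[0-9]{2} [A-Za-z]{3} [0-9]{4}", raw$X1)
is_match <- grepl("^[0-9]{2}:[0-9]{2}$", raw$X1)

# Carry the most recent date header down to the rows below it
# (rows before the first header, if any, get NA)
carried <- c(NA, raw$X1[is_date])[cumsum(is_date) + 1]

matches <- data.frame(
  date   = sub(" - .*$", "", carried[is_match]),  # drop the " - Play Offs" suffix
  time   = raw$X1[is_match],
  match  = raw$X2[is_match],
  score  = raw$X4[is_match],
  odds_1 = raw$X5[is_match],
  odds_2 = raw$X6[is_match],
  stringsAsFactors = FALSE
)

matches
```

This keeps one row per game and drops the header and blank rows. The regular expressions are based only on the rows shown here, so they may need adjusting for other rows on the page.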

The odds are coming out as decimal numbers, unfortunately, but hopefully you can work with that.
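If American (moneyline) odds are what you wanted, which is an assumption on my part, decimal odds can be converted with the standard rule: for decimal odds d of 2 or more the moneyline is +100 * (d - 1), and below 2 it is -100 / (d - 1). A minimal sketch, where decimal_to_american is a hypothetical helper name:

```r
# Convert decimal odds to American (moneyline) odds, rounded to whole numbers.
# Valid decimal odds are greater than 1; anything else becomes NA.
decimal_to_american <- function(d) {
  d[which(d <= 1)] <- NA
  ifelse(d >= 2,
         round(100 * (d - 1)),    # underdog: positive moneyline
         round(-100 / (d - 1)))   # favourite: negative moneyline
}

decimal_to_american(c(1.49, 2.71, 2.03, 1.83))
# -204  171  103 -120
```

The function is vectorised, so it can be applied directly to the two odds columns of the scraped table.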
