
Can't extract href link from html_node in rvest

When I use an XPath with the rvest package to extract the embedded links (the football team names) from the site, I get an empty result. Could someone help with this?

The code is as follows:

library(rvest)
 
url <- read_html('https://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1') 
    
xpath <- as.character('/html/body/div[2]/div[11]/div[1]/div[2]/div[2]/div')

url %>%
  html_node(xpath=xpath) %>% 
  html_attr('href')

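The XPath in the question ends at a div, and a div normally has no href attribute, so html_attr('href') has nothing to return. Select the a elements inside the table cells instead. You can get all the links using: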

library(rvest)

url <- 'https://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1'


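url %>%
  read_html() %>%
  html_nodes('td.hauptlink a') %>%                # anchors inside td.hauptlink cells
  html_attr('href') %>%                           # href value of each anchor
  .[. != '#'] %>%                                 # drop placeholder '#' links
  paste0('https://www.transfermarkt.com', .) %>%  # prefix the site root to make full URLs
  unique() %>%                                    # remove duplicates
  head(20)                                        # keep the first 20 links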
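
To see why the choice of node matters, here is a minimal offline sketch. The HTML string is a made-up miniature of a club table (the real page markup may differ), and the names html, page, cell_href, hrefs and links are only illustrative. It shows that html_attr('href') gives NA on an element without that attribute, and the actual values once the a elements are selected.

```r
library(rvest)

# Hypothetical stand-in for the page; not the real Transfermarkt markup.
html <- '<table>
  <tr><td class="hauptlink"><a href="/club-a/startseite/verein/1">Club A</a></td></tr>
  <tr><td class="hauptlink"><a href="#">placeholder</a></td></tr>
  <tr><td class="hauptlink"><a href="/club-b/startseite/verein/2">Club B</a></td></tr>
</table>'
page <- read_html(html)

# The <td> itself has no href attribute, so html_attr() returns NA.
cell_href <- page %>%
  html_node('td.hauptlink') %>%
  html_attr('href')

# The anchors inside the cells do carry href values.
hrefs <- page %>%
  html_nodes('td.hauptlink a') %>%
  html_attr('href')

# Drop the '#' placeholder and prefix the site root.
links <- paste0('https://www.transfermarkt.com', hrefs[hrefs != '#'])
print(cell_href)
print(links)
```

The same check works on a live page: if html_attr() returns NA, inspect which element the selector or XPath actually matched before looking elsewhere.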
