
How to save the results from a for-loop when getting html websites in R?

I was wondering how to store and retrieve the data from a for loop when aiming to scrape multiple websites in R.

library(rvest)
library(dplyr)
library(tidyverse)
library(glue)

cont <- rep(NA, 101)

countries <- c("au", "at", "de", "se", "gb", "us")

for (i in countries) {
  sides <- glue("https://www.beeradvocate.com/beer/top-rated/", i, .sep = "")
  html <- read_html(sides)
  cont[i] <- html %>%
    html_nodes("table") %>%
    html_table()
}

table_au <- cont[2][[1]]

The idea is to get one table per website. When I run my code, table_au just shows NA, presumably because the loop results are not being stored.

It would be awesome if someone could help me.

BR,

Marco

We can extract all the tables into a list.

library(rvest)

url <- "https://www.beeradvocate.com/beer/top-rated/"
temp <- purrr::map(paste0(url, countries), ~ {
  .x %>%
    read_html() %>%
    html_nodes("table") %>%
    html_table(header = TRUE) %>%
    .[[1]]
})
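If a single combined table is easier to work with, the list can also be collapsed into one data frame with a country column. A sketch, assuming the temp list above (note that dplyr::bind_rows needs compatible column types across the scraped tables, which is usually the case here):

```r
library(dplyr)

# Name the list elements so bind_rows can record which country each row came from
names(temp) <- countries

# .id = "country" turns the list names into a column
all_tables <- bind_rows(temp, .id = "country")
head(all_tables)
```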

If you want the data as separate data frames like tab_au, tab_at, we can name the list and use list2env to put each table into the global environment.

names(temp) <- paste0('tab_', countries)
list2env(temp, .GlobalEnv)
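For completeness, the original for-loop approach also works once cont is a list rather than an atomic vector, since a list can hold one data frame per country. A minimal sketch of that fix, assuming the same countries vector as above:

```r
library(rvest)
library(glue)

cont <- list()  # a list can store a data frame under each country code

for (i in countries) {
  side <- glue("https://www.beeradvocate.com/beer/top-rated/{i}")
  cont[[i]] <- read_html(side) %>%
    html_nodes("table") %>%
    html_table(header = TRUE) %>%
    .[[1]]    # keep only the first table on the page
}

table_au <- cont[["au"]]
```

Assigning with cont[[i]] on a list stores each result under its country code, so the tables can be retrieved by name afterwards.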
