How do I save the results of a for loop when scraping multiple websites in R?

Question

I would like to know how to store and retrieve data from a for loop when scraping multiple websites in R.

library(rvest)
library(dplyr)
library(tidyverse)
library(glue)

cont<-rep(NA,101)

countries <- c("au","at","de","se","gb","us")

for (i in countries) {
  sides <- glue("https://www.beeradvocate.com/beer/top-rated/", i, .sep = "")
  html <- read_html(sides)
  cont[i] <- html %>%
    html_nodes("table") %>% html_table()
}

table_au <- cont[2] [[1]]

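The idea is to get a separate list for each website. When I run my code, table_au only shows NA, presumably because the loop results are not being stored.

It would be great if someone could help me.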

BR,

Marco

Answer 1

We can extract the tables into a list, keeping the first table from each page.

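library(rvest)

# Country codes, as defined in the question
countries <- c("au", "at", "de", "se", "gb", "us")

url <- "https://www.beeradvocate.com/beer/top-rated/"
# For each country page, read the HTML and keep its first table
temp <- purrr::map(paste0(url, countries), ~{
          .x %>% 
           read_html() %>%
           html_nodes("table") %>% 
           html_table(header = TRUE) %>% .[[1]]
})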

If you want the data as separate data frames, such as tab_au and tab_at, we can name the list and use list2env to create each one as its own object.

names(temp) <- paste0('tab_', countries)
list2env(temp, .GlobalEnv)
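
As for why table_au came back as NA in the original loop: a likely explanation is that cont[i] with a character i such as "au" does not fill position 2 but appends a new element named "au", so cont[2] still holds the initial NA. The sketch below reproduces this without any scraping. Note that fake_tables is a hypothetical stand-in for the list that html_table() returns, and results is an illustrative name, not something from the question.

```r
# Hypothetical stand-in for the list of tables returned by html_table()
fake_tables <- list(data.frame(beer = "example", score = 4.5))

# What the original loop does: indexing by name appends a new element
cont <- rep(NA, 101)
cont["au"] <- fake_tables
length(cont)          # 102: the table sits in a new element named "au"
is.na(cont[2][[1]])   # TRUE: position 2 was never written

# One possible fix: a named list, written and read with [[ ]]
countries <- c("au", "at", "de", "se", "gb", "us")
results <- setNames(vector("list", length(countries)), countries)
for (i in countries) {
  # In the real loop, the right-hand side would be the table scraped for country i
  results[[i]] <- fake_tables[[1]]
}
table_au <- results[["au"]]
```

With the list filled this way, results[["de"]] retrieves the German table by name, so there is no need to remember which position belongs to which country.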

Answer 1: accepted, 2020-04-15 09:30:15