First I scrape a number of URLs from a website and collect them into a data frame. Then I want to loop over the URLs I collected. This is my code:
library(rvest)
library(dplyr)
library(XLConnect)
##########GET URLS###################################################################################
urls <- read_html("http://www.klassiekshop.nl/labels/labels-a-e/brilliant-classics/?limit=all")
urls <- urls %>%
html_nodes(".product-name a") %>%
html_attr("href") %>%
as.character()
url <- as.data.frame(urls)
as.character(url$urls)
#########EXTRACT URLS FROM DATAFRAME URLS############################################################
#########CREATE DATAFRAME############################################################################
EAN <- 0
price <- 0
df <- data.frame(EAN, price)
#########GET DATA####################################################################################
pricing_data <- for (i in urls) {
  site <- read_html(i)
  print(i)
  stats <- data.frame(EAN   = site %>% html_node("b") %>% html_text(),
                      price = site %>% html_node(".price") %>% html_text(),
                      stringsAsFactors = FALSE)
  data <- rbind(df, stats)
}
When debugging, the loop runs over the URLs, but it doesn't collect the data. Does anyone know how to get the data from the site?
Thanks!
This is because you are rbinding df to stats, but you never update df. I think you want to change the last line of your code to: df <- rbind(df, stats)
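To make that concrete, here is a minimal sketch of the corrected loop, assuming the same selectors as your code (`b` and `.price`) still match something on each product page:

```r
library(rvest)
library(dplyr)

# Start with an empty data frame instead of dummy 0 values.
df <- data.frame(EAN = character(0), price = character(0),
                 stringsAsFactors = FALSE)

for (i in urls) {
  site <- read_html(i)
  stats <- data.frame(EAN   = site %>% html_node("b") %>% html_text(),
                      price = site %>% html_node(".price") %>% html_text(),
                      stringsAsFactors = FALSE)
  df <- rbind(df, stats)  # reassign df so the rows accumulate across iterations
}
```

Note that growing a data frame with rbind inside a loop is slow for many URLs; a more idiomatic alternative is to build a list of one-row data frames with lapply and combine them once at the end with dplyr::bind_rows.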