
R - Cumulative storage of looped cbind() results and possible lapply solution to double for-loop

I have found a way to solve the problem, based on @Ryan's suggestion. The code is:

for (i in seq_along(url)){

  webpage <- read_html(url[i]) #loop through URL list to access html data

  fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
  fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
  fac_data <- c(fac_data, fac_data1) #Store table data on each URL in a variable 

  x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data

  for (j in seq_along(headers[[i]])){
    y <- cbind(x[,j]) #extract column data and store in temporary variable
    colnames(y) <- as.character(headers[[i]][j]) #add column name
    print(cbind(y)) #loop through headers list to print column data in sequence. ** cbind(y) will be overwritten when I try to store the result on a list with 'z <- cbind(y)'.
  }
}

Now I can print all the values, each with the header that describes the data.


Some follow-up questions:

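  1. How can I cumulatively save the output of cbind(y) in a data.frame or a list? Assigning cbind(y) inside the loop overwrites the previous value, so I am left with only the last column of the last table. Like this: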

         退休年月
    [1,] "82年8月"

None of these variations work:

z[[x]][j] <- cbind(y)

> source('~/Google 雲端硬盤/R/scrapeFaculty.R')
Error in `*tmp*`[[x]] : 最多只能選擇一個元素 (English: attempt to select more than one element)

z[j] <- cbind(y)

> source('~/Google 雲端硬盤/R/scrapeFaculty.R')
There were 13 warnings (use warnings() to see them)

z[[j]] <- cbind(y)

> source('~/Google 雲端硬盤/R/scrapeFaculty.R')
Error in z[[j]] <- cbind(y) : 用來替換的元素比所要替換的值多 (English: more elements supplied than there are to replace)

  2. Can a simple lapply() call replace the double for-loop to solve the problem above?
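
For the first question, a minimal sketch of one way to accumulate the per-column results is to start from a list and index it with the loop counters `i` and `j` rather than with the matrix `x`. The `cells` data below is a hypothetical in-memory stand-in for the scraped text, and the `headers` values are made up for illustration. One possible reason `z[[j]] <- cbind(y)` fails is that `z` was created as an atomic vector rather than a list; that is an assumption, since the question does not show how `z` was initialised.

```r
# Hypothetical stand-ins for the scraped data: one character vector of
# table cells per page and one vector of column headers per page.
cells <- list(
  c("A", "1990", "B", "1995"),
  c("C", "Physics", "2001", "D", "Chemistry", "2003")
)
headers <- list(c("name", "retired"), c("name", "dept", "retired"))

# Start from a list so that every slot can hold a whole matrix.
z <- vector("list", length(cells))

for (i in seq_along(cells)) {
  x <- matrix(cells[[i]], ncol = length(headers[[i]]), byrow = TRUE)
  z[[i]] <- vector("list", length(headers[[i]]))
  for (j in seq_along(headers[[i]])) {
    y <- cbind(x[, j])              # one-column matrix
    colnames(y) <- headers[[i]][j]  # attach the matching header
    z[[i]][[j]] <- y                # index by the counters i and j
  }
}

str(z, max.level = 2)
```

Here `z[[i]][[j]]` holds column `j` of table `i`, so nothing is overwritten between iterations.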

Edit:

This is the final code I used to solve the problem:

for (i in seq_along(url)){

  webpage <- read_html(url[i])

  fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
  fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
  fac_data <- c(fac_data, fac_data1)

  x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data
  y <- cbind(x[,seq_along(headers[[i]])]) #extract column data
  colnames(y) <- as.character(headers[[i]]) #add column names
  ntu.hist[[i]] <- y #accumulate results in a list (ntu.hist must already exist as a list, e.g. ntu.hist <- list())

}
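
For the lapply() question, the loop above could probably be written as a single lapply() call that returns one matrix per page. The following is only a sketch under stated assumptions: `cells` is a hypothetical in-memory replacement for the text returned by read_html() and html_nodes(), the `headers` values are made up, and `build_table` is a helper name introduced here for illustration.

```r
# Hypothetical stand-ins for the scraped data (not from the original script).
cells <- list(
  c("A", "1990", "B", "1995"),
  c("C", "Physics", "2001", "D", "Chemistry", "2003")
)
headers <- list(c("name", "retired"), c("name", "dept", "retired"))

# Hypothetical helper: reshape one page's cells and name all columns at once.
build_table <- function(page_cells, page_headers) {
  x <- matrix(page_cells, ncol = length(page_headers), byrow = TRUE)
  colnames(x) <- as.character(page_headers)
  x
}

# lapply() collects one matrix per page in a list, so no explicit
# accumulator variable is needed.
ntu_hist <- lapply(seq_along(cells), function(i) {
  build_table(cells[[i]], headers[[i]])
})

ntu_hist[[2]]
```

In the real script the anonymous function would fetch and parse `url[i]` before calling the helper. Because lapply() builds the result list itself, the overwrite problem from the first question does not arise.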

I wonder whether you could bind several columns at once instead of binding them one at a time in a loop. Do these syntax options help?

y <- data.frame(col1=c(1:3),col2=c(4:6),col3=c(7:9))

cbind(y[,c(1:3)])

  col1 col2 col3
1    1    4    7
2    2    5    8
3    3    6    9

#In R, you can use ":" to specify a range. So 1,2,3,4 is equal to 1:4.
#If you don't want number 3 in that range, you can use c(1,2,4).

#For example:

cbind(y[,c(1,3)])

  col1 col3
1    1    7
2    2    8
3    3    9

