R for loop through vectors, if subscript out of bounds change to certain value

I have a list of vectors where the values sometimes range from 1 to 7 and sometimes from 1 to 5. I want to loop through them, get a frequency count with the function table, and then place those counts into a data frame, but I receive a "subscript out of bounds" error. This happens because the code indexes a position that the table does not have (for example, t[[6]] when only the values 1 to 5 occur). When this happens, I would like the count to be set to 0.

Is there an easy function I could wrap around the integer value, e.g. somefunction(t[[6]]), that returns 0?

#list of vectors, the first has values 1 to 7, the second has 1 to 5, 
#the third is 1 to 7 again and is only included to show that my real problem has many
# more vectors to evaluate


vectors<-list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))

#empty data frame
df<-data.frame()
#loop through list of vectors and get frequency count per vector
for (i in 1:length(vectors)) {
  #count frequency of each value as variable t
  t<-table(vectors[[i]])
  #put frequency count of each value in the data frame - the problem is
  #that in the second vector, there are only values of 1 to 5, so t[[6]]
  #reports "subscript out of bounds". I want to change this to a value of 0
  df<-rbind(df,cbind(t[[1]],t[[2]],t[[3]],t[[4]],t[[5]],t[[6]],t[[7]]))
}

df

Instead of looping, we can set the names of the list, convert it to a two-column data.frame with stack, and then apply table:

table(stack(setNames(vectors, seq_along(vectors)))[2:1])
#  values
#ind  1  2  3  4  5  6  7
#  1  2  2  3  2  3  4  5
#  2  2  2  3  2 12  0  0
#  3  2  2  3  2  3  4  5
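
To make the one-liner above easier to follow, here is a sketch that unpacks it into separate steps. The intermediate names (named, long, counts) are only illustrative and do not appear in the original answer.

```r
# Same data as in the question
vectors <- list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
                c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
                c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))

# Step 1: name the list elements "1", "2", "3" so stack() can label the rows
named <- setNames(vectors, seq_along(vectors))

# Step 2: long data.frame with columns 'values' and 'ind' (the list name)
long <- stack(named)

# Step 3: cross-tabulate ind (rows) against values (columns);
# combinations that never occur are counted as 0
counts <- table(long[2:1])
counts
```

Because table sees all vectors at once, the columns for 6 and 7 exist for every row, so no subscript ever goes out of bounds.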

The above is a table object. If we need to convert it to a data.frame (without reshaping to 'long' format):

as.data.frame.matrix(table(stack(setNames(vectors, seq_along(vectors)))[2:1]))

Here, we apply table only once, which is more efficient and less complicated because it automatically finds the unique values. If we loop, we have to find the unique values beforehand so that missing levels are counted as 0.


When looping over the list (here with sapply), we can convert each list element to a factor whose levels are the unique values across all the elements:

un1 <- sort(unique(unlist(vectors)))
t(sapply(vectors, function(x) table(factor(x, levels = un1))))
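
If an explicit for loop is preferred, as in the question, the same factor-levels idea could be sketched as follows. The names all_values and counts are illustrative, and preallocating a matrix is a design choice made here, not part of the original answer.

```r
# Same data as in the question
vectors <- list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
                c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
                c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))

# All values that occur anywhere in the list; these become the columns
all_values <- sort(unique(unlist(vectors)))

# Preallocate one row per vector and one column per possible value
counts <- matrix(0L, nrow = length(vectors), ncol = length(all_values),
                 dimnames = list(NULL, as.character(all_values)))

for (i in seq_along(vectors)) {
  # Fixed levels force table() to report 0 for values that are absent
  counts[i, ] <- table(factor(vectors[[i]], levels = all_values))
}

df <- as.data.frame(counts)
df
```

Filling a preallocated matrix avoids growing a data frame with rbind on every iteration.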

In the for loop, we could use rbind, but rbind expects the column names or the lengths to be the same. So, instead of rbind, an option is bind_rows from dplyr:

library(dplyr)
df <- data.frame()
for (i in seq_along(vectors)) {
  tbl1 <- table(vectors[[i]])
  df <- bind_rows(df, tbl1)
}

By default, bind_rows fills with NA for columns that are not found. Then we replace the NA with 0:

df[is.na(df)] <- 0

However, this is not as efficient as the option shown above that calls table only once.
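
For the literal question of a wrapper that returns 0 for a missing value, a minimal sketch could look up the count by name rather than by position. The function name count_or_zero and the example table tbl are hypothetical and not taken from the original post.

```r
# Hypothetical helper: return the count for 'value', or 0 if it never occurred
count_or_zero <- function(tbl, value) {
  key <- as.character(value)
  if (key %in% names(tbl)) as.integer(tbl[key]) else 0L
}

# Example table with only the values 1 to 5
tbl <- table(c(1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 5))

count_or_zero(tbl, 5)  # value is present, so its count is returned
count_or_zero(tbl, 6)  # value is absent, so 0 is returned

# Counts for all values 1 to 7 in one call
vapply(1:7, function(v) count_or_zero(tbl, v), integer(1))
```

Looking up by name also avoids a subtle issue with positional indexing: t[[6]] is the sixth entry of the table, which is the count of the value 6 only when every smaller value is present.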
