R for loop through vectors: if subscript is out of bounds, change to a certain value
I have a list of vectors where the values sometimes range from 1 to 7 and sometimes from 1 to 5. I want to loop through them, get a frequency count using the function table, and then place those values into a data frame, but I receive a "subscript out of bounds" error. This happens because the code expects an integer count at every position from 1 to 7, while the table for a vector that only contains 1 to 5 has just five entries. When this happens, I would like to set the integer value to 0.
Is there an easy function I could wrap around the integer value, e.g. somefunction(t[[6]]), that returns 0?
#list of vectors, the first has values 1 to 7, the second has 1 to 5,
#the third is 1 to 7 again and is only included to show that my real problem has many
# more vectors to evaluate
vectors<-list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))
#empty data frame
df<-data.frame()
#loop through list of vectors and get frequency count per vector
for (i in 1:length(vectors)) {
  #count frequency of each value as variable t
  t<-table(vectors[[i]])
  #put frequency count of each value in the data frame - the problem is
  #that in the second vector, there are only values of 1 to 5, so t[[6]]
  #reports "subscript out of bounds". I want to change this to a value of 0
  df<-rbind(df,cbind(t[[1]],t[[2]],t[[3]],t[[4]],t[[5]],t[[6]],t[[7]]))
}
df
Instead of looping, we can set the names of the list, convert it to a two-column data.frame with stack, and then apply table:
table(stack(setNames(vectors, seq_along(vectors)))[2:1])
#   values
#ind  1  2  3  4  5  6  7
#  1  2  2  3  2  3  4  5
#  2  2  2  3  2 12  0  0
#  3  2  2  3  2  3  4  5
The above is a table object. If we need to convert it to a data.frame (without reshaping to 'long' format):
as.data.frame.matrix(table(stack(setNames(vectors, seq_along(vectors)))[2:1]))
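To make the 'long' versus wide distinction concrete, here is a small sketch that compares the two conversions on the same table. The variable names tab, wide and long are only illustrative; the vectors are copied from the question.

```r
# Input list copied from the question
vectors <- list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
                c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
                c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))

# One call to table() over the stacked list (rows: list element, columns: value)
tab <- table(stack(setNames(vectors, seq_along(vectors)))[2:1])

# Wide: one row per list element, one column per value
wide <- as.data.frame.matrix(tab)

# Long: one row per (ind, values) combination, with the count in Freq
long <- as.data.frame(tab)

dim(wide)        # 3 rows, 7 columns
names(long)      # "ind" "values" "Freq"
wide["2", "6"]   # 0: value 6 never occurs in the second vector
```

The wide form is the shape the question is after; the long form is what as.data.frame returns for a table by default.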
Here, we apply table only once, which is more efficient and less complicated because it automatically finds the unique values. If we are looping, we have to find the unique values beforehand so that the missing levels can be added and counted as 0.
With a loop, we can convert the individual list elements to factor, with levels specified as the unique values of all the elements:
un1 <- sort(unique(unlist(vectors)))
t(sapply(vectors, function(x) table(factor(x, levels = un1))))
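If an explicit for loop is preferred, the same factor idea can be sketched as below. This is one possible arrangement, not the only one: the names lvls and counts are illustrative, and the result matrix is preallocated instead of growing a data frame with rbind.

```r
# Input list copied from the question
vectors <- list(c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7),
                c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5),
                c(1,1,2,2,3,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7))

# All values that occur anywhere in the list; these become the factor levels
lvls <- sort(unique(unlist(vectors)))

# Preallocate one row per vector and one column per level, filled with 0
counts <- matrix(0L, nrow = length(vectors), ncol = length(lvls),
                 dimnames = list(NULL, as.character(lvls)))

for (i in seq_along(vectors)) {
  # table() on a factor reports every level, including those with count 0
  counts[i, ] <- as.integer(table(factor(vectors[[i]], levels = lvls)))
}

df <- as.data.frame(counts)
df
```

Because every table now has the same length, no subscript can be out of bounds, and the values absent from the second vector show up as 0.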
In the for loop, we could use rbind, but rbind expects the column names or the lengths to be the same. So, instead of rbind, an option is bind_rows from dplyr:
library(dplyr)
df <- data.frame()
for(i in seq_along(vectors)) {
  tbl1 <- table(vectors[[i]])
  df <- bind_rows(df, tbl1)
}
By default, bind_rows fills with NA for columns that are not found. Then we replace the NA with 0:
df[is.na(df)] <- 0
But this is not as efficient as the option shown earlier, which calls table only once.
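As for the wrapper the question asks about, one possible sketch is a small lookup helper. The name count_or_zero is hypothetical and not part of base R or dplyr. It looks a value up by name in a one-dimensional table instead of by position, so it also stays correct when a value in the middle of the range is missing (with positional indexing, t[[6]] is the sixth entry of the table, which is not necessarily the count for the value 6).

```r
# Hypothetical helper: return the count stored under `value` in a
# one-dimensional table, or 0 when that value does not occur
count_or_zero <- function(tab, value) {
  key <- as.character(value)
  if (key %in% names(tab)) as.integer(tab[[key]]) else 0L
}

# Second vector from the question: only values 1 to 5
v2 <- c(1,1,2,2,3,3,3,4,4,5,5,5,5,5,5,5,5,5,5,5,5)
t2 <- table(v2)

count_or_zero(t2, 5)   # count for a value that is present
count_or_zero(t2, 6)   # 0, since 6 does not occur in v2
```

This keeps the original loop structure, but it still calls table once per vector, so the single-call approach remains the simpler option for many vectors.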