Vectorizing in R instead of using a for loop

Question

I am trying to vectorize the following task with one of the apply functions, but without success. I have a list and a data frame. What I am trying to accomplish is to create subgroups in the data frame using a lookup list.

The lookup list (its entries are basically percentile groups) looks like the following:

Look_Up_List
$`1`
    A     B     C     D     E
0.000 0.370 0.544 0.698 9.655

$`2`
    A     B     C     D     E
0.000 0.506 0.649 0.774 1.192

The current data frame looks like this:

Score Big_group
0.1     1
0.4     1 
0.3     2

The resulting data frame must look like the following, with an additional column. Each score is placed in its percentile bucket from the lookup list entry for the corresponding Big_group:

Score Big_group Sub_Group
0.1     1         A
0.4     1         B
0.3     2         A

Thanks so much.

Answer 1

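You can create a function like this:

myFun <- function(x) {
  # x is one row of the data frame: x[1] is Score, x[2] is Big_group
  breaks <- Look_Up_List[[as.character(x[2])]]
  # findInterval gives the position of the bucket that contains the score
  names(breaks)[findInterval(x[1], breaks)]
}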

And apply it row by row with apply (mydf is the data frame from the question):

apply(mydf, 1, myFun)
# [1] "A" "B" "A"
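
The row-wise apply above still calls findInterval once per row. Since findInterval is itself vectorized, one possible variation is to call it once per Big_group instead. The following is only a sketch under assumptions: label_group is a hypothetical helper name, the input data is retyped from the question, and every score is assumed to be at least as large as the smallest break of its group (otherwise findInterval returns 0 and the lengths no longer match).

```r
# Input data retyped from the question (list names are the Big_group keys).
Look_Up_List <- list(
  `1` = c(A = 0.000, B = 0.370, C = 0.544, D = 0.698, E = 9.655),
  `2` = c(A = 0.000, B = 0.506, C = 0.649, D = 0.774, E = 1.192)
)
mydf <- data.frame(Score = c(0.1, 0.4, 0.3), Big_group = c(1, 1, 2))

# Hypothetical helper: label all scores of one group with a single
# vectorized findInterval() call.
label_group <- function(scores, key) {
  breaks <- Look_Up_List[[key]]
  names(breaks)[findInterval(scores, breaks)]
}

# Split the scores by group, label each chunk, then restore the row order.
score_chunks <- split(mydf$Score, mydf$Big_group)
label_chunks <- Map(label_group, score_chunks, names(score_chunks))
mydf$Sub_Group <- unsplit(label_chunks, mydf$Big_group)

mydf
```

With many rows and few groups this makes only one findInterval call per group, which may or may not matter for a given dataset.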

Answer 2

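    # reproducible input data (list elements must be named with `=`, not `<-`)
    Look_Up_List <- list('1' = c(A=0.000, B=0.370, C=0.544, D=0.698, E=9.655),
                         '2' = c(A=0.000, B=0.506, C=0.649, D=0.774, E=1.192))
    Current <- data.frame(Score=c(0.1, 0.4, 0.3),
                          Big_group=c(1,1,2))

    # Solution 1: take the last bucket whose lower bound is below the score,
    # using the breaks of each row's own Big_group
    # (max() on the names relies on the bucket names sorting alphabetically)
    Current$Sub_Group <- sapply(seq_len(nrow(Current)), function(i) {
      breaks <- Look_Up_List[[as.character(Current$Big_group[i])]]
      max(names(breaks[Current$Score[i] > breaks]))
    })

    # Alternative solution (using findInterval, slightly slower at least for this dataset)
    Current$Sub_Group <- sapply(seq_len(nrow(Current)), function(i) {
      breaks <- Look_Up_List[[as.character(Current$Big_group[i])]]
      names(breaks)[findInterval(Current$Score[i], breaks)]
    })

    # show result
    Current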
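
Both solutions in this answer loop over row indices. As a possible variation, mapply can walk the Score and Big_group columns in parallel, which avoids the explicit indexing. This is a sketch rather than part of the original answer; the input data is retyped from the question, and it assumes every Big_group value has a matching entry in Look_Up_List.

```r
# Input data retyped from the question.
Look_Up_List <- list(
  `1` = c(A = 0.000, B = 0.370, C = 0.544, D = 0.698, E = 9.655),
  `2` = c(A = 0.000, B = 0.506, C = 0.649, D = 0.774, E = 1.192)
)
Current <- data.frame(Score = c(0.1, 0.4, 0.3), Big_group = c(1, 1, 2))

# Iterate over both columns at once; each call sees one score and its group.
Current$Sub_Group <- mapply(function(score, grp) {
  breaks <- Look_Up_List[[as.character(grp)]]  # breaks for this row's group
  names(breaks)[findInterval(score, breaks)]   # name of the matching bucket
}, Current$Score, Current$Big_group)

Current
```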

Answer 1
1 Accepted 2014-07-11 16:45:41

Answer 2
0 2014-07-11 17:05:52