Writing a function with a loop when threshold is met

I have a data frame (data) with 639 observations and 6 columns. Each cell is a time in seconds. I computed thresholds for each column.

So far I have calculated the thresholds per column, so there are 6 lower and 6 upper thresholds for the 6 columns:

threshold1
[1] 16 22 31  6 11 13

threshold2
[1] 200.0 275.0 387.5  75.0 137.5 162.5

These thresholds are the minimum and maximum seconds per column. I would like to do the following for all columns: in column 1, for example, highlight all cells with a value below 16 seconds and all cells with a value greater than 200 seconds.

I already did this with:

column1<-ifelse(data$column1<threshold1[1],"speeder",     
         ifelse(data$column1>threshold2[1], "slower",1))


column2<-ifelse(data$column2<threshold1[2],"speeder",     
         ifelse(data$column2>threshold2[2], "slower",1))

column3<-ifelse(data$column3<threshold1[3],"speeder",     
         ifelse(data$column3>threshold2[3], "slower",1))

and so on for all 6 columns.

Now I would like to write this as a loop so that I don't have to write the ifelse call by hand for every column, because I have several data sets with different numbers of columns.

First generate some data, named "dat":

dat <- data.frame(
    column1 = runif(n = 638, min=0, max=220),
    column2 = runif(n = 638, min=0, max=300),
    column3 = runif(n = 638, min=0, max=400),
    column4 = runif(n = 638, min=0, max=100),
    column5 = runif(n = 638, min=0, max=150),
    column6 = runif(n = 638, min=0, max=200))

# define thresholds    
threshold1 <- c(16, 22, 31, 6, 11, 13)
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5)

Using a loop

# Declare a list that will contain the results
results <- list()

# Loop over the columns
for(i in seq_len(ncol(dat))) {
    results[[colnames(dat)[i]]] <- ifelse(dat[,i] < threshold1[i],
                                          yes = "speeder", 
                                          no = ifelse(dat[,i] > threshold2[i], 
                                                      yes = "slower", no = 1))
}
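Since the question asks for something reusable across data sets with different numbers of columns, the loop above could be wrapped in a function. This is only a minimal sketch: the name flag_times, its arguments, and the toy data are made up for illustration. Note that ifelse coerces the numeric 1 to the string "1" once it is mixed with the labels, so the sketch writes "1" explicitly.

```r
# Hypothetical helper (not from the original answer): flag each value as
# "speeder", "slower", or "1" (the placeholder for values within range).
flag_times <- function(df, lower, upper) {
  # Expect exactly one lower and one upper threshold per column
  stopifnot(length(lower) == ncol(df), length(upper) == ncol(df))

  # Preallocate a named list, one element per column
  results <- vector("list", ncol(df))
  names(results) <- colnames(df)

  for (i in seq_len(ncol(df))) {
    x <- df[[i]]
    results[[i]] <- ifelse(x < lower[i], "speeder",
                           ifelse(x > upper[i], "slower", "1"))
  }
  results
}

# Small made-up example with two columns
toy <- data.frame(a = c(5, 50, 250), b = c(10, 100, 300))
flags <- flag_times(toy, lower = c(16, 22), upper = c(200, 275))
flags$a
```

With the data generated above, the call would be flag_times(dat, threshold1, threshold2).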

Using lapply

You could also use lapply instead of a loop, like so:

results <- lapply(seq_len(ncol(dat)), function(x) {
    ifelse(dat[,x] < threshold1[x],
           yes = "speeder", 
           no = ifelse(dat[,x] > threshold2[x],
                       yes = "slower", no = 1))
})

names(results) <- colnames(dat)
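Another option, sketched here as an alternative rather than as part of the original answer, is base R's Map, which walks over the columns and both threshold vectors in parallel, so no index is needed and the column names are carried over automatically. The toy data and the names lower and upper are assumptions for illustration.

```r
# Made-up example data: two columns, three rows
toy <- data.frame(column1 = c(5, 50, 250), column2 = c(10, 100, 300))
lower <- c(16, 22)    # one lower threshold per column
upper <- c(200, 275)  # one upper threshold per column

# Map pairs each column with its own lower and upper threshold
results <- Map(function(x, lo, hi) {
  ifelse(x < lo, "speeder", ifelse(x > hi, "slower", "1"))
}, toy, lower, upper)

names(results)
results$column2
```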

Results

You can access the results with results[[1]] to results[[6]], or with results$column1 to results$column6:

> head(results$column1, 100)

  [1] "1"       "1"       "1"       "1"       "1"       "1"       "slower" 
  [8] "1"       "slower"  "1"       "1"       "1"       "speeder" "1"      
 [15] "1"       "1"       "1"       "1"       "1"       "1"       "1"      
 [22] "slower"  "1"       "1"       "1"       "1"       "1"       "1"      
 [29] "1"       "1"       "1"       "slower"  "1"       "slower"  "slower" 
 [36] "1"       "1"       "1"       "1"       "speeder" "1"       "1"      
 [43] "1"       "1"       "speeder" "speeder" "1"       "1"       "slower" 
 [50] "1"       "1"       "slower"  "1"       "1"       "1"       "1"      
 [57] "1"       "1"       "1"       "1"       "1"       "1"       "1"      
 [64] "1"       "1"       "1"       "1"       "slower"  "1"       "1"      
 [71] "slower"  "1"       "1"       "1"       "speeder" "1"       "1"      
 [78] "1"       "1"       "1"       "1"       "slower"  "1"       "1"      
 [85] "1"       "1"       "1"       "1"       "1"       "1"       "1"      
 [92] "1"       "1"       "1"       "1"       "1"       "1"       "1"      
 [99] "speeder" "1" 
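To get an overview instead of reading the flags one by one, the labels could be counted per column. The following is a rough sketch on a small made-up results list; levels_used and counts are illustrative names. Fixing the factor levels keeps a label in the table even when it never occurs in a column.

```r
# Made-up stand-in for the results list produced above
results <- list(
  column1 = c("1", "slower", "speeder", "1"),
  column2 = c("1", "1", "1", "slower")
)

# Fix the levels so every label gets a row, even with a count of zero
levels_used <- c("speeder", "1", "slower")

# One column of counts per data column
counts <- sapply(results, function(x) table(factor(x, levels = levels_used)))
counts
```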

You can try lapply as well. It is more concise than an explicit loop, although the speed difference is usually small.

dat <- data.frame(
  column1 = runif(n = 638, min=0, max=220),
  column2 = runif(n = 638, min=0, max=300),
  column3 = runif(n = 638, min=0, max=400),
  column4 = runif(n = 638, min=0, max=100),
  column5 = runif(n = 638, min=0, max=150),
  column6 = runif(n = 638, min=0, max=200))

# define thresholds    
threshold1 <- c(16, 22, 31, 6, 11, 13)
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5)

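# Flag each column, then reshape the flat vector column-wise into a matrix.
# Using ncol(dat) instead of a hard-coded 6 keeps this working for other data sets.
result <- matrix(unlist(lapply(seq_len(ncol(dat)), function(i) {
  ifelse(dat[, i] < threshold1[i],
         yes = "speeder",
         no = ifelse(dat[, i] > threshold2[i],
                     yes = "slower", no = 1))
})), ncol = ncol(dat), byrow = FALSE)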

head(result)
     [,1]      [,2] [,3] [,4]     [,5] [,6]
[1,] "speeder" "1"  "1"  "slower" "1"  "1" 
[2,] "1"       "1"  "1"  "1"      "1"  "1" 
[3,] "1"       "1"  "1"  "1"      "1"  "1" 
[4,] "1"       "1"  "1"  "slower" "1"  "1" 
[5,] "1"       "1"  "1"  "1"      "1"  "1" 
[6,] "1"       "1"  "1"  "slower" "1"  "1" 
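If a character matrix is the goal anyway, the same result could also be built without lapply by comparing the whole data matrix against threshold matrices of the same shape. This is a sketch of that alternative, assuming all columns are numeric; toy, lower, upper, lower_m and upper_m are illustrative names, not part of the original answer.

```r
# Made-up example data: two columns, three rows
toy <- data.frame(column1 = c(5, 50, 250), column2 = c(10, 100, 300))
lower <- c(16, 22)
upper <- c(200, 275)

m <- as.matrix(toy)

# Repeat each column's threshold down the rows (byrow = TRUE fills row-wise)
lower_m <- matrix(lower, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)
upper_m <- matrix(upper, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)

# Start with the "1" placeholder everywhere, then overwrite the outliers
result <- matrix("1", nrow = nrow(m), ncol = ncol(m), dimnames = dimnames(m))
result[m < lower_m] <- "speeder"
result[m > upper_m] <- "slower"

result
```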
