简体   繁体   中英

Nested For and If Loops in R

I'm trying to write a function code for a clinical test in R. My R skills are quite rusty and I would really appreciate any help with it.

The function I am trying to write takes 31 values (there are 31 questions in the clinical test that a patient fills out). These 31 values are then scored separately (most questions have different ranges), and then combined together to get the weighted average for different parameters.

The scoring ranges:

for Q 1(defined as x1) - multiply the response by 10

for Q 2,6,5,9 - (scored on a scale of 6) score them as
1 - 100
2 - 80
3 - 60
4 - 40
5 - 20
6 - 0.

for Q 3,4,7,8,10,11,12,13,16,17,18 (scored on a scale of 6)
1 - 0
2 - 20
3 - 40
4 - 60
5 - 80
6 - 100

for Q 14, 25, 26, 27, 28, 29, 30 (scored on a scale of 5)
1 - 100
2 - 75
3 - 50
4 - 25
5 - 0

for Q 19,20 (scored on a scale of 5)
1 - 0
2 - 25
3 - 50
4 - 75
5 - 100

for Q 15, 21, 23, 24 (scored on a scale of 4)
1 - 0
2 - 33.3
3 - 66.7
4 - 100

for Q 22
1 - 0
2 - 50
3 -100

qolie31 <- function(x1, x2, x3, ...){
  x1a <- x1*10 
  z <- c(x2, x5, x6, x9)  
  {for (i in z){
    if (i==1){x == 100}
    else if(i==2){x == 80}
    else if(i==3){x==60}
    else if(i==4){x==40}
    else if(i==5){x==20}
    else (i==6){x==0}
    z2 <- x
  }
}

My questions:

  1. I've used the ... function on the first line of code to define that I need arguments from x1 to x31. My end goal is not to define them manually from 1 to 31. Please could someone tell me how to define arguments from x1 to x31, without manually writing on there

  2. How do I save the new score in the function, so that I can use that later for analysis?

In general, you can capture arbitrary numbers of arguments with ... by using list(...) . See more in this other question . However, this is normally best when you think that you won't know how many arguments are going to be supplied and you want to be able to handle that anyway. In this case, you know there should be 31 answers so ... is not appropriate. Instead, you should try to have your answers stored in a length 31 vector and supply that as the argument. Example below. Here I create short oneliners to transform each answer group according to the rules you laid out. This takes advantage of R's math functions which I think is cleaner (and faster?) than using if statements for everything. Then we just apply the transformation to each set of the answers and assign them to the output scores. Example with some random answers 1-3 shown.

If you are worried about typos being a problem, I included some commented code using assert_that to check for errors. You could check inside each score_ function that the answer is in the right range, for example an answer to question 22 shouldn't have value 4.

For the last part, you don't need to include an assignment inside the function. Just make sure it returns what you want and do the assignment when you call the function, as below.

eg_ans <- sample.int(3, 31, replace = TRUE)

transform_scores <- function(answers){
  # assertthat::assert_that(
  #   length(answers) == 31,
  #   msg = "There are not 31 values in input vector"
  # )
  score1 <- function(ans) ans * 10
  score6a <- function(ans) (6 - ans) * 20
  score6b <- function(ans) (ans - 1) * 20
  score5a <- function(ans) (5 - ans) * 25
  score5b <- function(ans) (ans - 1) * 25
  score4 <- function(ans) (ans - 1) * (100 / 3)
  score3 <- function(ans) (ans - 1) * 50

  scores <- numeric(31)
  scores[1] <- score1(answers[1])
  scores[c(2, 5:6, 9)] <- score6a(answers[c(2, 5:6, 9)])
  scores[c(3:4, 7:8, 10:13, 16:18)] <- score6b(answers[c(3:4, 7:8, 10:13, 16:18)])
  scores[c(14, 25:30)] <- score5a(answers[c(14, 25:30)])
  scores[19:20] <- score5b(answers[19:20])
  scores[c(15, 21, 23:24)] <- score4(answers[c(15, 21, 23:24)])
  scores[22] <- score3(answers[22])
  return(scores)
}

eg_scores <- transform_scores(eg_ans)
eg_scores
#>  [1]  30.00000  60.00000   0.00000  20.00000 100.00000 100.00000   0.00000
#>  [8]  20.00000  60.00000  20.00000   0.00000  40.00000   0.00000  75.00000
#> [15]  66.66667   0.00000   0.00000  20.00000  50.00000  50.00000  66.66667
#> [22] 100.00000   0.00000  33.33333 100.00000  75.00000 100.00000 100.00000
#> [29] 100.00000  50.00000   0.00000

Created on 2018-04-24 by the reprex package (v0.2.0).

You could use the mapvalues function from the plyr package.

    rescaleq<- function(x){
    require(plyr)
    if (length(x) != 30) stop("Vector of 30 elements required")
    x[1]<- x[1]*10
    x[c(2, 5, 6, 9)]<- mapvalues(x[c(2, 5, 6, 9)], from = 1:6, to = seq(100, 0, by = -20))
    x[c(3,4,7,8,10,11,12,13,16,17,18)]<- mapvalues(x[c(3,4,7,8,10,11,12,13,16,17,18)], from  = 1:6, to = seq(0, 100, by = 20))
    x[c(14, 25, 26, 27, 28, 29, 30)]<- mapvalues(x[c(14, 25, 26, 27, 28, 29, 30)], from = 1:5, to = seq(100, 0, by = -25))
    x[c(19, 20)]<- mapvalues(x[c(19, 20)], from = 1:5, to = seq(0, 100, by = 25))
    x[c(5, 21, 23, 24)]<- mapvalues(x[c(5, 21, 23, 24)], from = 1:4, to = seq(0, 100, length.out = 4))
     x[22]<- mapvalues(x[22], from = 1:3, to = seq(0, 100, by = 50))
    return(round(x, 2))
}

And to test it with some data:

> xvector <- sample.int(3, 31, replace=T)
> xvector
# [1] 2 1 3 2 2 3 2 1 1 3 1 3 1 1 1 1 2 1 3 1 1 2 1 1 2 2 3 1 3 3 
> rescaleq(xvector[-31]) # Note that below, these are messages NOT errors or warnings
#The following `from` values were not present in `x`: 4, 5, 6
#The following `from` values were not present in `x`: 4, 5, 6
#The following `from` values were not present in `x`: 4, 5
#The following `from` values were not present in `x`: 2, 4, 5
#The following `from` values were not present in `x`: 3, 4
#The following `from` values were not present in `x`: 1, 3
# [1]  20.00 100.00  80.00  60.00 100.00  40.00  20.00  20.00   0.00  40.00   0.00  40.00
#[13]   0.00   0.00  20.00   0.00 100.00  75.00  75.00  50.00 100.00  50.00  50.00  50.00
#[25]   0.00  33.33   0.00   0.00   0.00  50.00

If you want to remove the messages generated by mapvalues , try wrapping suppressMessages around them, ie suppressMessages(mapvalues(x[c(2, 5, 6, 9)], from = 1:6, to = seq(100, 0, by = -20))) etc.

Another way, this time using the tidyverse and a lookup table:

library(tidyverse)

data = "
1                             | 10
2,6,5,9                       | 100,80,60,40,20,0
3,4,7,8,10,11,12,13,16,17,18  | 0,20,40,60,80,100
14, 25, 26, 27, 28, 29, 30    | 100,75,50,25,0
19,20                         | 0,25,59,75,100
15, 21, 23, 24                | 0, 33.3, 66.7, 100
22                            | 0,50,100
"

df <- read.table(text = data, sep = '|', 
                 stringsAsFactors = F, 
                 col.names = c('q', 'factor'),
                 strip.white = T)

# create the lookup table
# save it somewhere
# as we only need to generate it once
lookup <- df %>%
  separate_rows(q, sep = ',') %>%
  separate_rows(factor, sep = ',', convert = T) %>%
  group_by(q) %>%
  mutate(item = 1:n()) %>%
  ungroup()

# calculate the score
calc_score <- function(x) {
  score <- 0
  for (i in seq_along(x)) {
    f <- lookup %>% filter(q == i, item == x[i]) %>% select(factor) %>% pull()
    score <- score + i * f
  }
  score
}

v <- c(1,4,3)
(score <- calc_score(v))

This yields a score of 210 for this example.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM