
R - setting a value based on a function applied to other values in the same row

I have a data frame containing (surprise) data. I have one column which I wish to populate on a per-row basis, calculated from the values of other columns in the same row.

From googling, it seems like I need 'apply', or one of its close relatives. Unfortunately I haven't managed to make it actually work.

Example code:

#Example function
getCode <- function (ar1, ar2, ar3){
  if(ar1==1 && ar2==1 && ar3==1){
    return(1)
  } else if(ar1==0 && ar2==0 && ar3==0){
    return(0)
  } 
  return(2)
}

#Create data frame
a = c(1,1,0)
b = c(1,0,0)
c = c(1,1,0)
df <- data.frame(a,b,c)
#Add column for new data
df[,"x"] <- 0

#Apply function to new column
df[,"x"] <- apply(df[,"x"], 1, getCode(df[,"a"], df[,"b"], df[,"c"]))

I would like df to be taken from:

  a b c x
1 1 1 1 0
2 1 0 1 0
3 0 0 0 0

to

  a b c x
1 1 1 1 1
2 1 0 1 2
3 0 0 0 0

Unfortunately running this spits out:

Error in match.fun(FUN) : 'getCode(df[, "a"], df[, "b"], df[, "c"])' is not a function, character or symbol

I'm new to R, so apologies if the answer is blindingly simple. Thanks.

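A few things: apply should be run along the data frame itself (i.e. apply(df, 1, someFunc)), and its third argument must be a function rather than the result of calling one, which is what the match.fun error is complaining about. It's also more idiomatic to access columns by name using the $ operator, so if you have a data frame named df with a column named a, access a with df$a.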
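To make the point about apply concrete, here is a minimal sketch that keeps the three-argument getCode from the question and wraps it in an anonymous function. The wrapper and the name `row` are illustrative assumptions, not part of the original code; apply hands each row over as a named vector, so the values can be picked out by column name.

```r
# getCode as defined in the question
getCode <- function(ar1, ar2, ar3) {
  if (ar1 == 1 && ar2 == 1 && ar3 == 1) {
    return(1)
  } else if (ar1 == 0 && ar2 == 0 && ar3 == 0) {
    return(0)
  }
  return(2)
}

df <- data.frame(a = c(1, 1, 0), b = c(1, 0, 0), c = c(1, 1, 0))

# The third argument is a function; apply calls it once per row (MARGIN = 1).
# 'row' is a hypothetical name for the named vector that apply passes in.
df$x <- apply(df[, c("a", "b", "c")], 1,
              function(row) getCode(row["a"], row["b"], row["c"]))
df
```

Restricting the input to columns a, b and c keeps the call safe to re-run after x has been added.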

In this case, I like to do an sapply along the index of the dataframe, and then use that index to get the appropriate elements from the dataframe.

df$x <- sapply(seq_len(nrow(df)), function(i) getCode(df$a[i], df$b[i], df$c[i]))
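A related option, offered here as an alternative rather than as what this answer proposes, is mapply, which walks several vectors in parallel and passes one element from each to the function. A minimal sketch, assuming the original three-argument getCode from the question:

```r
# getCode as defined in the question
getCode <- function(ar1, ar2, ar3) {
  if (ar1 == 1 && ar2 == 1 && ar3 == 1) {
    return(1)
  } else if (ar1 == 0 && ar2 == 0 && ar3 == 0) {
    return(0)
  }
  return(2)
}

df <- data.frame(a = c(1, 1, 0), b = c(1, 0, 0), c = c(1, 1, 0))

# mapply calls getCode(df$a[i], df$b[i], df$c[i]) for each i
df$x <- mapply(getCode, df$a, df$b, df$c)
df
```

This avoids indexing by hand, at the cost of listing each column explicitly.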

As @devmacrile mentioned above, I would just modify the function so that it takes a single vector with 3 elements as input, and then use it within an apply call as you intended.

#Example function
getCode <- function (x){
  ifelse(x[1]==1 & x[2]==1 & x[3]==1,
         1,
         ifelse(x[1]==0 & x[2]==0 & x[3]==0,
                0,
                2))
}


#Create data frame
a = c(1,1,0)
b = c(1,0,0)
c = c(1,1,0)

df <- data.frame(a,b,c)
df

#   a b c
# 1 1 1 1
# 2 1 0 1
# 3 0 0 0


# create your new column of results
df$x = apply(df, 1, getCode)
df

#   a b c x
# 1 1 1 1 1
# 2 1 0 1 2
# 3 0 0 0 0
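Since ifelse is itself vectorised, the same logic can also be applied to whole columns at once, with no apply call at all. This is a sketch of that variation under the same example data, not something the answer above claims; for large data frames it would typically be the faster route.

```r
df <- data.frame(a = c(1, 1, 0), b = c(1, 0, 0), c = c(1, 1, 0))

# Each comparison works on the full column, so the result has one value per row
df$x <- ifelse(df$a == 1 & df$b == 1 & df$c == 1,
               1,
               ifelse(df$a == 0 & df$b == 0 & df$c == 0,
                      0,
                      2))
df
```

Note the single & here: it compares element by element, whereas && in the original getCode expects a single TRUE or FALSE.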
