
Calculate variable with a function that is applied to a data frame in R

I am trying to create a reusable function that calculates conversion. It will be applied to a data frame and return the value (or NA) based on a few conditions on other variables. This is my first attempt at creating a multi-conditional calculation in a function.

It first looks at a variable called parentID, which is categorical. Only the value 377 is calculated differently. Then it looks at two variables, leads and clicks, to check whether they are at least 1. If not, it returns NA. Then it decides whether leads or sales is greater and makes the calculation based on which is greater.

The calculation is simple: x$sales / x$clicks or x$leads / x$clicks

set_cr <- function(x) {
  if (x$parentID==377) {
    if (x$leads < 1 | x$clicks < 1) {
      return(NA)
    }
    else {
      if (x$leads > x$sales) {
        cr <- x$leads / x$clicks
        return(cr)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }
  else {
    if (x$parentID != 377) {
      if (x$sales < 1 | x$clicks < 1) {
        return(NA)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }

  return(NA)
}

I am then applying this to a data frame using:

apply(df, 1, set_cr)

I expected to see the values printed in the console, but this has been throwing many errors, and after searching and checking multiple resources I have not been able to debug it. From here I would have used this to create an x$cr var in the data frame.

A sample data set for this question:

structure(list(parentID = c(377, 377, 311, 322, 333), clicks = c(9078, 
78404, 398443, 16142, 111715), sales = c(69, 95, 7191, 146, 33966
), leads = c(500, 0, 500, 0, 33966)), .Names = c("parentID", "clicks", 
"sales", "leads"), row.names = c(NA, 5L), class = "data.frame")

parentID clicks sales leads
     377   9078    69   500
     377  78404    95     0
     311 398443  7191   500
     322  16142   146     0
     333 111715 33966 33966

If there is a better way to share this data example please let me know and I can edit this. I recall a package but couldn't locate it in rseek or on crantastic for reusable data sets.

Thanks in advance.

apply, when used on a data frame, turns it into a matrix. If your data frame contains character or factor variables, the result will be a character matrix, and your code will fail.
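The coercion can be checked directly. The sketch below rebuilds the sample data from the question and inspects what a single row looks like once it has gone through as.matrix(). The names m, row1, dollar_result and df_chr are only illustrative, and the extra label column is a made-up example, not part of the question's data. With the all-numeric sample the matrix stays numeric, so the failure here most likely comes from using $ on an atomic vector:

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

# apply() coerces a data frame with as.matrix() before looping over rows
m <- as.matrix(df)
is.numeric(m)        # every column is numeric, so the matrix is numeric

# Each row reaches the function as a named atomic vector, not a list
row1 <- m[1, ]
is.atomic(row1)
row1["parentID"]     # indexing by name works on an atomic vector

# `$` is not defined for atomic vectors, so x$parentID raises an error
dollar_result <- tryCatch(row1$parentID, error = function(e) "error")

# Hypothetical extra column: one character column makes the whole matrix character
df_chr <- cbind(df, label = c("a", "b", "c", "d", "e"))
is.character(as.matrix(df_chr))
```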

In this case, however, you don't need apply. You can vectorise your code with nested ifelse calls:

set_cr <- function(x)
{
    # Use the element-wise | here: || is not vectorised and cannot be used on whole columns
    ifelse(x$parentID == 377,
           ifelse(x$leads < 1 | x$clicks < 1, NA, x$leads / x$clicks),
           ifelse(x$sales < 1 | x$clicks < 1, NA, x$sales / x$clicks))
}

set_cr(df)

(I assume you made a typo in the second else code block.)
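If the comparison between leads and sales for parentID 377 in the question was intentional rather than a typo, it can be kept in vectorised form as well. The following is a minimal sketch under that assumption; set_cr_full, is_377, numerator and gate are names introduced here for illustration and do not appear in the question:

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

set_cr_full <- function(x) {
  is_377 <- x$parentID == 377
  # For parentID 377 divide the larger of leads and sales; otherwise divide sales
  numerator <- ifelse(is_377, pmax(x$leads, x$sales), x$sales)
  # The question tests leads for 377 and sales for every other parentID
  gate <- ifelse(is_377, x$leads, x$sales)
  ifelse(gate < 1 | x$clicks < 1, NA, numerator / x$clicks)
}

# Store the result as a new column, as the question intended
df$cr <- set_cr_full(df)
df
```

pmax() takes the element-wise maximum, which gives the same numerator as the question's if (leads > sales) branch without looping over rows.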

Try using

x['var'] instead of x$var

Your function should work.

set_cr <- function(x) {
  if (x['parentID'] == 377) {
    if (x['leads'] < 1 || x['clicks'] < 1) {
      return(NA)
    }
    else {
      if (x['leads'] > x['sales']) {
        cr <- x['leads'] / x['clicks']
        return(cr)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  else {
    if (x['parentID'] != 377) {
      if (x['sales'] < 1 || x['clicks'] < 1) {
        return(NA)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  return(NA)
}
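For completeness, a row-wise function of this kind can be combined with apply to fill the cr column the question asks for. This is a compact sketch of the same logic rather than the code above verbatim: set_cr_row is a hypothetical name, and it uses [[ ]] instead of single brackets so that each lookup returns a bare number without a name attached.

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

set_cr_row <- function(x) {
  # x is one row, passed by apply() as a named numeric vector
  if (x[["parentID"]] == 377) {
    if (x[["leads"]] < 1 || x[["clicks"]] < 1) return(NA_real_)
    # Larger of leads and sales, as in the question's inner if/else
    return(max(x[["leads"]], x[["sales"]]) / x[["clicks"]])
  }
  if (x[["sales"]] < 1 || x[["clicks"]] < 1) return(NA_real_)
  x[["sales"]] / x[["clicks"]]
}

# MARGIN = 1 applies the function to each row; the result has one value per row
df$cr <- apply(df, 1, set_cr_row)
df
```

This only works while every column is numeric. Once a character or factor column is present, apply hands each row over as strings and the comparisons stop behaving numerically.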

