
Calculate variable with a function that is applied to a data frame in R

I am trying to create a reusable function that calculates a conversion rate. It will be applied to a data frame and return a value (or NA) based on a few conditions on other variables. This is my first attempt at writing a multi-conditional calculation in a function.

It will first look at a variable called parentID, which is categorical. Only the value 377 is calculated differently. Then it will look at the values of two variables, leads and clicks, to check whether both are at least 1. If not, it will return NA. Then it will decide whether leads or sales is greater and base the calculation on whichever is greater.

The calculation is simple: x$sales / x$clicks or x$leads / x$clicks

set_cr <- function(x) {
  if (x$parentID==377) {
    if (x$leads < 1 | x$clicks < 1) {
      return(NA)
    }
    else {
      if (x$leads > x$sales) {
      cr <- x$leads / x$clicks
      return(cr)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }
  else {
    if (x$parentID != 377) {
      if (x$sales < 1 | x$clicks < 1) {
        return(NA)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }

  return(NA)
}

I am then applying this to a data frame using:

apply(df, 1, set_cr)

I expected to see the values printed in the console, but this throws many errors, and after searching and checking multiple resources I have not been able to debug it. From there I would use this to create an x$cr variable in the data frame.

A sample data set for this question:

structure(list(parentID = c(377, 377, 311, 322, 333), clicks = c(9078, 
78404, 398443, 16142, 111715), sales = c(69, 95, 7191, 146, 33966
), leads = c(500, 0, 500, 0, 33966)), .Names = c("parentID", "clicks", 
"sales", "leads"), row.names = c(NA, 5L), class = "data.frame")

parentID clicks sales leads
     377   9078    69   500
     377  78404    95     0
     311 398443  7191   500
     322  16142   146     0
     333 111715 33966 33966

If there is a better way to share this data example, please let me know and I can edit this. I recall a package for reusable data sets but couldn't locate it on rseek or crantastic.

Thanks in advance.

apply, when used on a data frame, turns it into a matrix. If your data frame contains character or factor variables, the result will be a character matrix, and your code will fail. Each row is also passed to your function as a plain vector, so x$parentID fails because $ does not work on atomic vectors.
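To see what apply actually hands to the function, a small check like the one below may help. It is only a sketch on a made-up two-row data frame with the question's column names; the variable names row_is_atomic, dollar_fails and bracket_result are mine, not from the question.

```r
# Two illustrative rows with the same columns as the question's data
df <- data.frame(
  parentID = c(377, 311),
  clicks   = c(9078, 398443),
  sales    = c(69, 7191),
  leads    = c(500, 500)
)

# Each row reaches the function as an atomic (numeric) vector,
# not as a one-row data frame
row_is_atomic <- apply(df, 1, is.atomic)

# `$` is not defined for atomic vectors, so this call raises an error
dollar_fails <- inherits(
  try(apply(df, 1, function(x) x$parentID), silent = TRUE),
  "try-error"
)

# Indexing by name works on the same row vectors
bracket_result <- apply(df, 1, function(x) x[["parentID"]])
```

Printing row_is_atomic, dollar_fails and bracket_result in the console shows the difference between the two ways of accessing a value.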

In this case, however, you don't need apply. You can vectorise your code with nested ifelse calls:

set_cr <- function(x) 
{
    # Use the elementwise `|` here: `||` is not vectorised and only
    # handles a single TRUE/FALSE value.
    ifelse(x$parentID == 377,
        ifelse(x$leads < 1 | x$clicks < 1, NA, x$leads / x$clicks),
        ifelse(x$sales < 1 | x$clicks < 1, NA, x$sales / x$clicks))
}

set_cr(df)

(I assume you made a typo in the second else code block.)
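If the comparison between leads and sales described in the question is wanted for parentID 377, one possible vectorised variant is sketched below. It assumes the intended rule is "377 is gated on leads and uses the larger of leads and sales; everything else is gated on sales and uses sales", which is my reading of the question's code. The name set_cr_vec is hypothetical.

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

set_cr_vec <- function(x) {
  is_377 <- x$parentID == 377
  # Value that must be at least 1: leads for 377, sales otherwise
  gate <- ifelse(is_377, x$leads, x$sales)
  # pmax() is the elementwise maximum, so 377 rows use the larger
  # of leads and sales; all other rows use sales
  numerator <- ifelse(is_377, pmax(x$leads, x$sales), x$sales)
  ifelse(gate < 1 | x$clicks < 1, NA_real_, numerator / x$clicks)
}

# Store the result as a new column, as the question intended
df$cr <- set_cr_vec(df)
```

Because every step works on whole columns, no apply call or row loop is needed.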

Try using

x['var'] instead of x$var

Your function should then work:

set_cr <- function(x) {
  if (x['parentID'] == 377) {
    if (x['leads'] < 1 || x['clicks'] < 1) {
      return(NA)
    }
    else {
      if (x['leads'] > x['sales']) {
        cr <- x['leads'] / x['clicks']
        return(cr)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  else {
    if (x['parentID'] != 377) {
      if (x['sales'] < 1 || x['clicks'] < 1) {
        return(NA)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  return(NA)
}
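As a usage sketch, a row-wise function of this kind can fill the cr column through apply. The version below is a condensed rewrite that I believe follows the same rules as the function above; set_cr_row is a hypothetical name, and [[ ]] is used so each lookup returns a bare number rather than a named one.

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

# x is one row of the data frame, passed by apply() as a named numeric vector
set_cr_row <- function(x) {
  is_377 <- x[["parentID"]] == 377
  # 377 is gated on leads, every other parentID on sales
  gate <- if (is_377) x[["leads"]] else x[["sales"]]
  if (gate < 1 || x[["clicks"]] < 1) {
    return(NA_real_)
  }
  # 377 uses the larger of leads and sales; others use sales
  numerator <- if (is_377) max(x[["leads"]], x[["sales"]]) else x[["sales"]]
  numerator / x[["clicks"]]
}

# MARGIN = 1 calls the function once per row
df$cr <- apply(df, 1, set_cr_row)
```

This only works cleanly while every column is numeric; with a character or factor column, apply would pass character vectors and the comparisons and division would no longer behave as intended.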

