Error using the “prob” package in an R function

I'm attempting to write a function that uses the prob package to compute conditional probabilities. When using the function I continue to encounter the same error, which states an object within the function cannot be found.

Below is a reproducible example in which I compute a conditional probability without the function and then attempt to use the function to produce the same result. I'm not sure if the error is due to limitations with the prob package or an error on my part.

# Load prob package
library(prob)

# Set seed for reproducibility
set.seed(30)

# Sample data frame
sampledata <- data.frame(
  X <- sample(1:10),
  Y <- sample(c(-1, 0, 1), 10, replace=TRUE))

# Set probability space
S <- probspace(sampledata)

# Subset Y between -1 and 0
A <- subset(S, Y>=-1 & Y<=0)

# Subset X greater than 6
B <- subset(S, X>6)

# Compute conditional probability
P <- prob(A, given=B)

The above code produces the following probability:

> P
[1] 0.25

Attempting to write a function to calculate the same probability:

# Create function with data frame, variables, and conditional inputs
prob.function <- function(df, variable1, variable2, state1, state2, cond1){
  s <- probspace(df)
  a <- subset(s, variable1>=state1 & variable1<=state2)
  b <- subset(s, variable2>cond1)
  p <- prob(a, given=b)
  return(p)
}

# Demonstrate the function
test <- prob.function(sampledata, Y, X, -1, 0, 6)

This function gives the following error:

Error in eval(expr, envir, enclos) : object 'b' not found

Any help you can provide would be great.

Thanks!

This looks like a bug in prob.

When I run this in vanilla R, I get the same error. But when I create an object b in my workspace, the error disappears:

> print(b)
Error in print(b) : object 'b' not found
> test <- prob.function(sampledata, Y, X, -1, 0, 6)
Error in eval(expr, envir, enclos) : object 'b' not found
>
> b <- "dummy variable"
> print(b)
[1] "dummy variable"
> test <- prob.function(sampledata, Y, X, -1, 0, 6)
> test
[1] 0.25
>

As a temporary workaround, just create a dummy b in your current environment.


As for the bug, if you look at the source for prob.default (which in the example above is what prob(a, given=b) is eventually calling), you'll see the following section:

if (missing(given)) {
    < cropped >
}
else {
    f <- substitute(given)
    g <- eval(f, x)                  <~~~~ 
    if (!is.logical(g)) {            <~~~~
        if (!is.data.frame(given))   <~~~~
            stop("'given' must be data.frame or evaluate to logical")
        B <- given

    }
    ...
    < cropped >
}

The code jumps from checking g to checking given, perhaps inadvertently. I would reach out to the package maintainer, as this may be an oversight.
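The lookup failure can be reproduced in base R without the package. The sketch below is a hypothetical stand-in for the substitute()/eval() pattern quoted above, not the prob source; lookup_given, lookup_given_caller, wrapper, and dat are illustrative names. It assumes no object named b exists in the workspace.

```r
# Hypothetical stand-in for the lookup pattern quoted above.
# This is NOT the prob package's code; all names here are illustrative.
lookup_given <- function(x, given) {
  f <- substitute(given)  # the unevaluated symbol the caller typed, e.g. `b`
  eval(f, x)              # searches x's columns, then this function's frame and its enclosures
}

# Variant that also searches the caller's frame, as subset() does
lookup_given_caller <- function(x, given) {
  f <- substitute(given)
  eval(f, x, parent.frame())
}

wrapper <- function(dat, lookup) {
  b <- dat[dat$X > 1, , drop = FALSE]  # `b` exists only inside wrapper()
  lookup(dat, given = b)
}

dat <- data.frame(X = 1:3)

# Fails: `b` is neither a column of dat nor visible from lookup_given()
failed <- tryCatch(wrapper(dat, lookup_given), error = function(e) e)

# Succeeds: the caller's frame, where `b` lives, is searched as well
found <- wrapper(dat, lookup_given_caller)
```

This would also explain why a dummy b in the global environment makes the error go away: the global environment is on the search path of the first variant, while the frame of the calling function is not.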

I don't think this is a bug in package prob.

First, you should create your sampledata as

sampledata <- data.frame(
  X = sample(1:10),
  Y = sample(c(-1, 0, 1), 10, replace=TRUE))

Your original code uses <- inside data.frame(), so it creates not only the data frame but also variables X and Y in the global environment, and it is those global variables that are being used later when you call your function.
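A small base R illustration of the difference, using made-up values rather than the sampled data from the question (leaky and clean are illustrative names):

```r
# `<-` inside data.frame() is an assignment in the calling environment,
# so it creates X and Y there and leaves the columns with derived names.
leaky <- data.frame(X <- 1:3, Y <- c(-1, 0, 1))
names(leaky)      # not "X" and "Y"
exists("X")       # TRUE: X now exists outside the data frame

# `=` names the columns and assigns nothing outside the data frame
clean <- data.frame(X = 1:3, Y = c(-1, 0, 1))
names(clean)      # "X" "Y"
```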

Second, you shouldn't call subset() inside a function. Use bracket subsetting instead:

prob.function <- function(df, variable1, variable2, state1, state2, cond1){
  s <- probspace(df)
  a <- s[s[[variable1]]>=state1 & s[[variable1]]<=state2, ]
  b <- s[s[[variable2]]>cond1, ]
  p <- prob(a, given=b)
  return(p)
}

And pass variable1 and variable2 as strings:

test <- prob.function(sampledata, "Y", "X", -1, 0, 6)

Now test == 0.25, with no error.
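As a cross-check that does not depend on the package, the same string-based column lookup can be sketched in base R. This assumes every row is an equally likely outcome, so the conditional probability reduces to a ratio of row counts; cond_prob and toy are hypothetical names, and the data is a fixed toy example rather than the sampled data from the question.

```r
# Conditional probability P(A | B) by counting rows, assuming each row
# is an equally likely outcome. Column names are passed as strings.
cond_prob <- function(df, variable1, variable2, state1, state2, cond1) {
  in_a <- df[[variable1]] >= state1 & df[[variable1]] <= state2  # event A
  in_b <- df[[variable2]] > cond1                                # event B
  sum(in_a & in_b) / sum(in_b)
}

# Fixed toy data: rows with X > 6 have Y = 1, 1, 0, 1
toy <- data.frame(X = 1:10, Y = c(-1, 0, 1, 1, 0, -1, 1, 1, 0, 1))

result <- cond_prob(toy, "Y", "X", -1, 0, 6)
result  # 0.25: one of the four rows with X > 6 has Y between -1 and 0
```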

