
Trying to understand R error: Error in FUN(X[[i]], …) : only defined on a data frame with all numeric variables

I've been getting this error message and traceback:

Error in FUN(X[[i]], ...) : 
  only defined on a data frame with all numeric variables 

5 stop("only defined on a data frame with all numeric variables") 
4 FUN(X[[i]], ...) 
3 lapply(args, function(x) {
    x <- as.matrix(x)
    if (!is.numeric(x) && !is.complex(x)) 
        stop("only defined on a data frame with all numeric variables") ... 
2 Summary.data.frame(structure(list(Date = structure(c(279L, 285L, 
291L, 297L, 303L, 315L, 321L, 327L, 333L, 339L, 345L, 357L, 363L, 
369L, 375L, 387L, 393L, 399L, 405L, 417L, 423L, 429L, 435L, 441L, 
447L, 453L, 477L, 501L, 555L, 561L, 567L, 573L, 579L, 585L, 591L,  ... 
1 corr("specdata") 

From my research, this seems to mean that there is non-numeric data in my data set. The data set I'm using is from the Coursera course, so if the data itself were the problem I'd expect others to be hitting the same error, but I can't find any mention of a similar problem on the discussion boards or elsewhere online. My only guess is that it is caused by my function code, which is below:

corr <- function(directory, threshold = 0) {

    vect1 <- numeric()
    files_list <- list.files(directory, full.names = TRUE)

    for (i in 1:332) {

        data <- read.csv(files_list[i])
        good <- complete.cases(data)
        complete_data <- data[good,]
        sulfate <- complete_data[,2]
        nitrate <- complete_data[,3]

        if (sum(complete_data) >= threshold) {
            b <- cor(sulfate,nitrate)
            vect1 <- rbind(b)
        }
        else vect1 <- (numeric())
    }
    return(vect1)
}

From the error message and the traceback I "think" the error is occurring when the correlation is running on the sulfate and nitrate columns. When I've run the code on just the first file in the directory, it runs fine with no error messages. Any help or insight as to why this error is occurring, and how to fix it would be helpful.

I have tried to coerce the data set to numeric:

complete_data <- as.numeric(data[good,])

but I get a different error message back: "Error: (list) object cannot be coerced to type 'double'".

The answer is that I can't sum the object 'complete_data'. I meant to sum the logical vector 'good', but by mistake I summed the wrong object. I used the row count (nrow) of complete_data instead, and that solved my problem!
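To make the cause concrete, here is a minimal sketch that reproduces the error on a made-up data frame. The column names and values are assumptions standing in for one of the course files, and the exact wording of the error message can differ between R versions.

```r
# Toy data frame standing in for one monitor file (values are made up).
df <- data.frame(
  Date    = c("2003-01-01", "2003-01-02", "2003-01-03"),
  sulfate = c(1.2, NA, 3.4),
  nitrate = c(0.5, 0.7, NA),
  ID      = c(1L, 1L, 1L)
)

good <- complete.cases(df)      # logical vector, one entry per row
complete_data <- df[good, ]     # still a data frame, Date column included

# sum() on a data frame dispatches to Summary.data.frame, which rejects
# the non-numeric Date column and stops with the error from the question.
msg <- tryCatch(sum(complete_data), error = function(e) conditionMessage(e))
print(msg)

# Either of these counts the complete rows instead.
n_sum  <- sum(good)             # each TRUE counts as 1
n_rows <- nrow(complete_data)
```

Both n_sum and n_rows hold the number of complete rows, which is the quantity the threshold is meant to be compared against.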

Perhaps you should be counting the number of rows in good data, instead of trying to sum an entire data frame.

if (nrow(complete_data) >= threshold) {
    b <- cor(sulfate,nitrate)
    # append to the results; rbind(b) alone would overwrite vect1 on every pass
    vect1 <- c(vect1, b)
}
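Putting the pieces together, one possible version of the whole function is sketched below. It is not the course's reference solution. It assumes, as in the question, that sulfate and nitrate are columns 2 and 3 of each file, it keeps the question's >= comparison, and it loops over whatever files are found rather than a hard-coded 1:332. It also drops the else branch, which would otherwise reset the result whenever a file fell below the threshold. Files with fewer than two complete rows are not handled specially.

```r
corr <- function(directory, threshold = 0) {
  files_list <- list.files(directory, full.names = TRUE)
  results <- numeric()

  for (path in files_list) {
    data <- read.csv(path)
    complete_data <- data[complete.cases(data), ]

    # Compare the number of complete rows, not the sum of the data frame.
    if (nrow(complete_data) >= threshold) {
      # Columns referenced by position (2 = sulfate, 3 = nitrate), as in the question.
      b <- cor(complete_data[, 2], complete_data[, 3])
      # Append, so earlier correlations are kept.
      results <- c(results, b)
    }
  }

  results
}
```

A call such as corr("specdata", threshold = 150) would then return one correlation per file that has at least 150 complete rows.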


 