
When is a data.frame in R numeric?

I stumbled on the following problem. I have a data.frame:

A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

The classes of its columns are

sapply(A, class)
      let         x         y 
 "factor" "integer" "integer" 
is.numeric(A$x)
[1] TRUE
is.numeric(A)
[1] FALSE

I do not understand why, although A$x and A$y are numeric, the data.frame composed of only these two columns is not numeric:

is.numeric(A[, c("x", "y")])
[1] FALSE

Removing the factor column does not help...

B <- A
B$let <- NULL
is.numeric(B)
[1] FALSE
is.numeric(B$x)
[1] TRUE
is.numeric(B$y)
[1] TRUE

So I tried creating a new dataset built only from the numeric columns in A. Is it numeric? No...

C <- data.frame(B$x, B$y)
is.numeric(C)
[1] FALSE
C <- data.frame(as.numeric(B$x), as.numeric(B$y))
is.numeric(C)
[1] FALSE

There must be something I'm missing here. Any help?

We need to apply the function to each column vector, not to the data.frame itself:

sapply(A[c("x", "y")], is.numeric)

instead of

is.numeric(A)

because, according to ?is.numeric:

Methods for is.numeric should only return true if the base type of the class is double or integer and values can reasonably be regarded as numeric (e.g., arithmetic on them makes sense, and comparison should be done via the base type).

The class of A is data.frame, which is not numeric:

class(A)
#[1] "data.frame"

sapply(A, class)

By default, is.numeric returns TRUE only if the type of the object is double or integer and the object is not a factor.


Thus, is.numeric applied to a data.frame is always FALSE; it can only be TRUE when applied to a vector such as an extracted column. That is why we loop over the columns with lapply/sapply: each column is passed to the function as a vector, so the result reflects the class of that column.
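To make the point above concrete, here is a small sketch that inspects the underlying type of the data frame and of one of its columns. It reuses the data frame A from the question; nothing else is assumed.

```r
# Recreate the data frame from the question
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

# A data frame is stored as a list of column vectors,
# so its base type is "list", not "integer" or "double"
typeof(A)        # "list"
is.list(A)       # TRUE
is.numeric(A)    # FALSE

# An extracted column is a plain integer vector
typeof(A$x)      # "integer"
is.numeric(A$x)  # TRUE
```

Because the base type of A is list, dropping the non-numeric column cannot change the result of is.numeric(A).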

A data frame is always a data frame, regardless of the classes of its columns, so what you get is the expected behaviour.

If you want to check whether all columns in a data frame are numeric, you can use the following code:

all(sapply(A, is.numeric))
## [1] FALSE
all(sapply(A[, c("x", "y")], is.numeric))
## [1] TRUE
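The same check can be wrapped in a small helper. This is only a sketch: the name all_numeric is hypothetical (it is not a base R function), and vapply is used in place of sapply so that the result is guaranteed to be a logical vector.

```r
# Hypothetical helper: TRUE if every column of df is numeric
all_numeric <- function(df) {
  stopifnot(is.data.frame(df))
  all(vapply(df, is.numeric, logical(1)))
}

A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

all_numeric(A)         # FALSE, because of the 'let' column

# Keep only the numeric columns, without naming them by hand
num_only <- A[vapply(A, is.numeric, logical(1))]
names(num_only)        # "x" "y"
all_numeric(num_only)  # TRUE
```

Selecting columns with a logical vector avoids hard-coding c("x", "y") when the set of numeric columns is not known in advance.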

A table with only numeric data can also be understood as a matrix. You can convert the numeric columns of your data frame to a matrix as follows:

M <- as.matrix(A[, c("x", "y")])
M
##      x y
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6

The matrix M is now really numeric:

is.numeric(M)
## [1] TRUE
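One caveat, sketched below with the same data frame A: the conversion only gives a numeric matrix if every selected column is numeric. If a non-numeric column such as let is included, as.matrix produces a character matrix instead.

```r
A <- data.frame(let = c("A", "B", "C"), x = 1:3, y = 4:6)

# Converting all columns, including 'let', coerces everything to character
M_all <- as.matrix(A)
typeof(M_all)      # "character"
is.numeric(M_all)  # FALSE

# Converting only the numeric columns keeps the matrix numeric
M_num <- as.matrix(A[vapply(A, is.numeric, logical(1))])
typeof(M_num)      # "integer"
is.numeric(M_num)  # TRUE
```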

