Converting from a character to a numeric data frame

I have a character data frame in R which has NaNs in it. I need to remove any row with a NaN and then convert it to a numeric data frame.

If I just call as.numeric on the data frame, I run into the following error:

Error: (list) object cannot be coerced to type 'double'

As @thijs van den bergh points out,

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)

dat <- as.data.frame(sapply(dat, as.numeric)) #<- sapply is here

dat[complete.cases(dat), ]
#  x y
#2 2 3

is one way to do this.

Your error comes from calling as.numeric on the data.frame as a whole. The sapply call I show instead converts each column vector to numeric.
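A minimal sketch of that difference, assuming the same two-column example data. The names `whole` and `x_num` are only illustrative, and the error is caught with tryCatch so the snippet runs to the end:

```r
# Example data: two character columns, as in the answer above
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# Whole data.frame at once: as.numeric() raises an error, which is
# captured here as a character string instead of stopping the script
whole <- tryCatch(as.numeric(dat), error = function(e) conditionMessage(e))

# One column at a time: each column is a plain character vector,
# so the conversion works and the string "NaN" becomes a numeric NaN
x_num <- as.numeric(dat$x)
```

The exact wording of the captured message can differ between R versions, so it is best not to rely on it.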

Note that data.frames are not numeric or character, but rather are a list whose columns can be all numeric, all character, or a mix of these and other types (e.g. Date or logical).

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)
is.list(dat)
# [1] TRUE

The example data just has two character columns:

> str(dat)
'data.frame':   2 obs. of  2 variables:
 $ x: chr  "NaN" "2"
 $ y: chr  "NaN" "3"

...which you could add a numeric column to like so:

> dat$num.example <- c(6.2,3.8)
> dat
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
> str(dat)
'data.frame':   2 obs. of  3 variables:
 $ x          : chr  "NaN" "2"
 $ y          : chr  "NaN" "3"
 $ num.example: num  6.2 3.8

So when you call as.numeric on the whole data.frame, R cannot coerce it, because it is a list whose columns are separate vectors that may be of different types. user1317221_G's answer uses the sapply function (see ?sapply), which applies a function to the individual items of an object. You could alternatively use lapply (see ?lapply), which is a very similar function. You can read more on the *apply functions in "R Grouping functions: sapply vs. lapply vs. apply vs. tapply vs. by vs. aggregate".
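As a rough illustration of how the two functions differ on this kind of data, here is a small sketch. The data is rebuilt inline so the snippet runs by itself, and the names `m` and `l` are only illustrative:

```r
# Example data: two character columns with two rows each
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# sapply() tries to simplify its result; with equal-length columns
# of more than one row it returns a numeric matrix
m <- sapply(dat, as.numeric)

# lapply() never simplifies; it always returns a list,
# with one numeric vector per column
l <- lapply(dat, as.numeric)
```

One thing to keep in mind is that sapply simplifies differently when the data has a single row (it returns a named vector rather than a matrix), whereas the lapply result always has the same list shape.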

In this case, you can apply the as.numeric function to each column of your data.frame, like so:

data.frame(lapply(dat,as.numeric))

The lapply call is wrapped in data.frame() to make sure the output is a data.frame and not a list. That is, running:

lapply(dat,as.numeric)

will give you:

> lapply(dat,as.numeric)
$x
[1] NaN   2

$y
[1] NaN   3

$num.example
[1] 6.2 3.8

Whereas:

data.frame(lapply(dat,as.numeric))

will give you:

>  data.frame(lapply(dat,as.numeric))
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
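Putting the pieces together for the original question, a hedged sketch of the whole task (convert every column, then drop rows containing NaN) might look like the following. It uses the `num[] <- lapply(...)` idiom as an alternative to wrapping the call in data.frame(); the names `num` and `clean` are only illustrative:

```r
# Example data: two character columns, one row made up of "NaN" strings
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# Assigning into num[] replaces the columns but keeps the
# data.frame structure, so no extra data.frame() call is needed
num <- dat
num[] <- lapply(num, as.numeric)

# complete.cases() flags rows without missing values; NaN counts
# as missing, so the first row is dropped
clean <- num[complete.cases(num), ]
```

Here `clean` should hold only the second row, with both columns stored as numeric.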

