Extract rows from a data.frame that contain only numeric values

To my surprise, I couldn't find this being asked before.

My input data.frame

a <- c(1,5,3,1,-8,6,-1)
b <- c(4,-2,1.0,"1 2","-","1.2.3","x")
df <- data.frame(a,b)
df
   a     b
1  1     4
2  5    -2
3  3     1
4  1   1 2
5 -8     -
6  6 1.2.3
7 -1     x

Desired output

  a  b
1 1  4
2 5 -2
3 3  1

What I came up with

df[apply(df, 1, function(r) !any(is.na(as.numeric(r)))) ,]

It works, but it throws some ugly warnings:

  a  b
1 1  4
2 5 -2
3 3  1
Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
4: In FUN(newX[, i], ...) : NAs introduced by coercion

Any better idea?

I don't think the warnings are a big problem. They just tell you what you already know: four of the character values become NA when coerced to numeric.
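If the warnings are only noise, one option is to wrap the check from the question in base R's suppressWarnings(). This is just a sketch built on the question's own data; the names keep and res are mine.

```r
# Rebuild the data frame from the question
a <- c(1, 5, 3, 1, -8, 6, -1)
b <- c(4, -2, 1.0, "1 2", "-", "1.2.3", "x")
df <- data.frame(a, b)

# TRUE for rows in which every cell can be coerced to a number;
# suppressWarnings() hides the "NAs introduced by coercion" messages
keep <- suppressWarnings(
  apply(df, 1, function(r) !any(is.na(as.numeric(r))))
)

res <- df[keep, ]
res
```

Note that suppressWarnings() silences every warning raised inside the call, so it is best kept around the coercion only.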

You could use dplyr to keep only the rows where b is a positive or negative integer string (the pattern below does not match decimals), then convert that column to numeric:

library(dplyr) # dplyr also provides the pipe %>%

df %>% 
  filter(grepl("^-?[[:digit:]]+$", b)) %>% 
  mutate(b = as.numeric(b))

Result:

  a  b
1 1  4
2 5 -2
3 3  1
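The pattern in that answer only accepts whole numbers. If decimals should count as numeric too, a slightly wider pattern might do. The following is a base R sketch under my own assumptions: the data frame df2 is a hypothetical variant of the question's data with "1.5" in it, and "numeric" here means an optional minus sign, digits, and at most one fractional part (no exponents, no leading plus).

```r
# Hypothetical variant of the question's data with one decimal value
a <- c(1, 5, 3, 1, -8, 6, -1)
b <- c("4", "-2", "1.5", "1 2", "-", "1.2.3", "x")
df2 <- data.frame(a, b, stringsAsFactors = FALSE)

# Optional minus, one or more digits, optional single fractional part
num_pattern <- "^-?[0-9]+(\\.[0-9]+)?$"

# Keep matching rows, then convert the surviving strings
res2 <- df2[grepl(num_pattern, df2$b), ]
res2$b <- as.numeric(res2$b)
res2
```

Because only strings that match the pattern reach as.numeric(), no coercion warnings are raised.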

A couple of base R solutions (without warnings)

rowSums

df[ !is.na( rowSums( sapply( df, strtoi ) ) ), ]

  a  b
1 1  4
2 5 -2
3 3  1

complete.cases

df[ complete.cases( sapply( df, strtoi ) ), ]

  a  b
1 1  4
2 5 -2
3 3  1
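To see why these work, it may help to look at the intermediate matrix. strtoi() parses base-10 integers by default and returns NA, without a warning, for anything else, so a value like "1.5" would also be dropped. Both solutions leave b as character after subsetting; the sketch below adds a conversion step. The names parsed and res3 are mine.

```r
# Rebuild the data frame from the question
a <- c(1, 5, 3, 1, -8, 6, -1)
b <- c(4, -2, 1.0, "1 2", "-", "1.2.3", "x")
df <- data.frame(a, b, stringsAsFactors = FALSE)

# Integer matrix with NA wherever a cell is not a base-10 integer string
parsed <- sapply(df, strtoi)
parsed

# Keep rows without NA, then convert b, which is still character here
res3 <- df[complete.cases(parsed), ]
res3$b <- as.integer(res3$b)
str(res3)
```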
