
Subsetting data frames in R

I'm new to R and learning about subsetting. I have a table and I'm trying to get the size of a subset of the table. My issue is that when I try two different ways I get two different answers. For a table "dat" where I'm trying to select all rows where RMS is 5 and BDS is 2:

dim(dat[(dat$RMS==5) & (dat$BDS==2),])

gives me a different answer than

dim(subset(dat,(dat$RMS==5) & (dat$BDS==2)))

The second one is correct. Could someone explain why these are different, and why the first one is giving me the wrong answer?

Thanks

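The difference most likely comes from how the two methods treat NA values. When the condition evaluates to NA for a row, indexing with [ ] returns a row filled with NA in its place, whereas subset() treats NA as FALSE and drops the row. If you remove rows with NA from the data frame, you should get the same results: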

dat_clean = na.omit(dat)
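
To make the NA behaviour concrete, here is a minimal sketch. The column names RMS and BDS come from the question, but the values are invented for illustration, and the variable names (keep, by_bracket, by_subset, by_which) are only placeholders.

```r
# Hypothetical data: column names follow the question, values are made up
dat <- data.frame(RMS = c(5, 5, NA, 4),
                  BDS = c(2, NA, 2, 2))

# The condition is NA wherever the comparison cannot be decided
keep <- dat$RMS == 5 & dat$BDS == 2
print(keep)

# [ ] returns an all-NA row for every NA in the index, inflating the count
by_bracket <- dat[keep, ]
print(dim(by_bracket))

# subset() treats NA as FALSE, so only rows that really match are kept
by_subset <- subset(dat, RMS == 5 & BDS == 2)
print(dim(by_subset))

# which() drops NA positions, so [ ] then agrees with subset()
by_which <- dat[which(keep), ]
print(dim(by_which))

# After na.omit() both approaches agree as well
dat_clean <- na.omit(dat)
print(dim(dat_clean[dat_clean$RMS == 5 & dat_clean$BDS == 2, ]))
```

Note that na.omit() drops a row if any column contains NA, including columns that are not part of the condition, so wrapping the condition in which() may be the less destructive fix if the data frame has other columns.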

Works for me. With data that contains no NA values, both approaches return the same result:

> x = c(1,1,2,2,3,3)
> y = c(4,4,5,5,6,6)
> X = data.frame(x,y)
> dim(X[X$x==1 & X$y==4,])
[1] 2 2
> X[X$x==1 & X$y==4,]
  x y
1 1 4
2 1 4
> dim(subset(X,(X$x==1) & (X$y==4)))
[1] 2 2
> subset(X,(X$x==1) & (X$y==4))
  x y
1 1 4
2 1 4
