Iterating through subsetted data in R

I am attempting an assignment for Coursera, so this is homework. I am hoping someone will explain why what I am doing does not work. I have a data frame called complete_cases, and I have to report how many records there are in specified 'sets' of observations from a much larger 'set'. The data are in the format:

              Date sulfate nitrate ID
279 2003-10-06    7.21   0.651  1
285 2003-10-12    5.99   0.428  1
291 2003-10-18    4.68   1.040  1
297 2003-10-24    3.47   0.363  1
303 2003-10-30    2.42   0.507  1
315 2003-11-11    1.43   0.474  1

and so on for 332 different sets, with IDs 1 to 332. I have already 'found' the records that are complete. Now I have to return, for each specified ID, which set the data are from and how many complete records that set contains. I am trying to use:

for (i in id){
   list <- nrow(complete_cases[i])
   data<-cbind(id = i,  nobs= list)
  }
data

If I call the function using one set of data, it appears to work fine and gives me:

      id nobs
[1,]  1  117

but trying to apply it to an id <- c(2,4,8,10,12) gives me the error:

Error in `[.data.frame`(complete_cases, i) : undefined columns selected

So what I was expecting is that the iteration would return the number of rows for each id in c(2,4,8,10,12) and return the id and the size for each id. Is this any clearer?

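The problem is in how you subset the data. With a single index and no comma, complete_cases[i] selects column i, not the rows where ID equals i. The data frame shown has only four columns, so i = 2 and i = 4 happen to run (and return the total row count, not a per-ID count), while i = 8 fails with "undefined columns selected". To select the rows whose ID column matches the iterator value, you must say so explicitly. There are a number of ways to do this; here is one: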

complete_cases[complete_cases$ID == i, ]
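To make the difference concrete, here is a minimal sketch on a made-up data frame. The name toy and its values are assumptions for illustration only, not the real assignment data. It contrasts single-index (column) selection with a logical row filter and reproduces the error from the question:

```r
# Toy stand-in for complete_cases (hypothetical values, illustration only)
toy <- data.frame(
  sulfate = c(7.21, 5.99, 4.68, 3.47, 2.42),
  nitrate = c(0.651, 0.428, 1.040, 0.363, 0.507),
  ID      = c(1, 1, 2, 2, 2)
)

# A single index with no comma selects COLUMNS:
# this is column 2 with every row, so nrow() is the full row count
n_col_index <- nrow(toy[2])

# A logical vector before the comma selects ROWS:
# only the rows where ID equals 2 are kept
n_row_filter <- nrow(toy[toy$ID == 2, ])

# Asking for a column that does not exist raises the error seen in the question;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(toy[8], error = function(e) conditionMessage(e))

print(c(column_index = n_col_index, row_filter = n_row_filter))
print(err)
```

Here toy[2] keeps all five rows, while the row filter keeps only the three rows belonging to ID 2.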

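You are also overwriting data on every iteration by just using data <- ..., so only the result for the last id survives. My personal favorite fix, which does not require you to specify the dimensions of the final result in advance, is to collect each piece in a list and bind the pieces together at the end: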

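data_summary <- vector("list")   # empty list that grows as the loop runs
k <- 1                           # position of the next result in the list
for (i in id){
   # count the rows whose ID column equals the current id
   current_id_rowcount <- nrow(complete_cases[complete_cases$ID == i, ])
   data_summary[[k]] <- cbind(id = i, nobs = current_id_rowcount)
   k <- k + 1
}
# stack the one-row matrices into a single matrix with columns id and nobs
final <- do.call(rbind, data_summary)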
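As a variation on the loop above, the same collect-then-bind idea can be wrapped in a function. This is only a sketch: the function name count_complete and the toy data are hypothetical, and it uses lapply() in place of the explicit counter k, so it is a swapped-in style and not the code from the answer itself.

```r
# Hypothetical helper: one row of (id, nobs) per requested id
count_complete <- function(complete_cases, id) {
  rows <- lapply(id, function(i) {
    # sum() over a logical vector counts the rows whose ID equals i
    data.frame(id = i, nobs = sum(complete_cases$ID == i))
  })
  # bind the one-row data frames into a single data frame
  do.call(rbind, rows)
}

# Made-up data: ID 2 appears twice, 4 once, 8 three times, 10 never
toy <- data.frame(ID = c(2, 2, 4, 8, 8, 8))
result <- count_complete(toy, c(2, 4, 8, 10))
print(result)
```

An id with no matching rows, such as 10 here, simply gets nobs = 0 and does not raise an error.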

