
How can I loop a data matrix in R?

I am trying to loop over a data matrix for each separate ID tag, "1", "2" and "3" (see my data at the bottom). Ultimately I want to transform the X and Y coordinates into a time series with the ts() function, but first I need to build a loop that returns a time series for each separate ID. The loop works perfectly well when I use the following code on a data frame:

for(i in 1:3){ print(na.omit(xyframe[ID==i,])) }

Returning the following output:

  Timestamp       X     Y ID
1         0 -34.012 3.406  1
2       100 -33.995 3.415  1
3       200 -33.994 3.427  1

  Timestamp       X     Y ID
4         0 -34.093 3.476  2
5       100 -34.145 3.492  2
6       200 -34.195 3.506  2

  Timestamp       X     Y ID
7         0 -34.289 3.522  3
8       100 -34.300 3.520  3
9       200 -34.303 3.517  3

Yet when I run the same loop on a matrix:

for(i in 1:3){ print(na.omit(xymatrix[ID==i,])) }

It returns the following error:

Error in print(na.omit(xymatrix[ID == i, ]) : 
  (subscript) logical subscript too long

Why does looping over the ID fail for the matrix when it works for the data frame, and how can I fix it? Furthermore, I have read that looping requires much more computational effort than doing the same thing in a vectorized way. Is there a vectorized way to do this?

The data (simplification of the real data):

  Timestamp       X     Y ID
1         0 -34.012 3.406  1
2       100 -33.995 3.415  1
3       200 -33.994 3.427  1
4         0 -34.093 3.476  2
5       100 -34.145 3.492  2
6       200 -34.195 3.506  2
7         0 -34.289 3.522  3
8       100 -34.300 3.520  3
9       200 -34.303 3.517  3

The form xymatrix[ID==i,] doesn't work for a matrix, because a bare ID does not refer to the matrix's ID column. Select the column explicitly instead:

for(i in 1:3){ print(na.omit(xymatrix[xymatrix[,'ID'] == i,])) }
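
To make this concrete, here is a minimal sketch that rebuilds the question's data as a matrix and subsets it by its own ID column. Building the matrix with cbind() is an assumption, since the question does not show how xymatrix was created. The drop = FALSE argument is an addition so that a group with only one matching row would still come back as a matrix rather than a plain vector.

```r
# Rebuild the question's data as a numeric matrix (construction assumed).
xymatrix <- cbind(
  Timestamp = rep(c(0, 100, 200), times = 3),
  X = c(-34.012, -33.995, -33.994, -34.093, -34.145, -34.195,
        -34.289, -34.300, -34.303),
  Y = c(3.406, 3.415, 3.427, 3.476, 3.492, 3.506, 3.522, 3.520, 3.517),
  ID = rep(1:3, each = 3)
)

# Index the ID column of the matrix itself rather than a free-standing `ID`.
for (i in 1:3) {
  rows <- xymatrix[, "ID"] == i
  # drop = FALSE keeps the result a matrix even if only one row matches.
  print(na.omit(xymatrix[rows, , drop = FALSE]))
}
```

The logical vector xymatrix[, "ID"] == i always has exactly one element per row of the matrix, so it cannot be "too long" as a row subscript.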

In general, if you want to apply a function to a data frame split by some factor, you should use one of the apply family of functions in combination with split.

Here's some reproducible sample data.

n <- 20
some_data <- data.frame(
  x = sample(c(1:5, NA), n, replace = TRUE),
  y = sample(c(letters[1:5], NA), n, replace = TRUE),
  grp = gl(3, 1, length = n)  # grouping factor with levels 1, 2, 3
)

If you want to print out the rows with no missing values, split by each level of grp, then you want something like this:

lapply(split(some_data, some_data$grp), na.omit)

Or, more concisely, using the plyr package:

library(plyr)
dlply(some_data, .(grp), na.omit)

Both methods return output like this:

# $`1`
#    x y grp
# 1  2 d   1
# 4  3 e   1
# 7  3 c   1
# 10 4 a   1
# 13 2 e   1
# 16 3 a   1
# 19 1 d   1

# $`2`
#   x y grp
# 2 1 e   2
# 5 3 e   2
# 8 3 b   2

# $`3`
#    x y grp
# 6  3 c   3
# 9  5 a   3
# 12 2 c   3
# 15 2 d   3
# 18 4 a   3
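
The same split-and-apply pattern can be carried over to the data in the question to get one time series per ID. The following is only a sketch: the data frame is retyped from the values in the post, and start = 0 and deltat = 100 are assumptions read off the Timestamp column, not settings confirmed by the asker. Note that lapply still iterates over the groups internally, so the gain over a for loop is mostly in clarity and in getting a named list back.

```r
# The question's data as a data frame (values copied from the post).
xyframe <- data.frame(
  Timestamp = rep(c(0, 100, 200), times = 3),
  X = c(-34.012, -33.995, -33.994, -34.093, -34.145, -34.195,
        -34.289, -34.300, -34.303),
  Y = c(3.406, 3.415, 3.427, 3.476, 3.492, 3.506, 3.522, 3.520, 3.517),
  ID = rep(1:3, each = 3)
)

# One multivariate time series (columns X and Y) per ID.
# start = 0 and deltat = 100 are assumed from the Timestamp values.
xy_ts <- lapply(
  split(xyframe, xyframe$ID),
  function(d) ts(na.omit(d)[, c("X", "Y")], start = 0, deltat = 100)
)

# The result is a list named by ID; pick out one series by its name.
xy_ts[["2"]]
```

If the data is held in a matrix rather than a data frame, one option is to convert it first with as.data.frame() and then split it in the same way.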
