
Subtract values between positions in matrix

In the matrix below, I want to output differences in Change values based on the values in the Position columns. Example: for ID1, take the average Change where Position1 = 1 and subtract the average Change where Position1 = 0.

Output for ID1 Position1

 Position1 = average(0.59, -0.04, 0.37) - average(-0.18)


 IDs   Change       Position1    Position2
 ID1   0.5941262037     1           1    
 ID1  -0.0418420656     1           1   
 ID1   0.3766006166     1           1   
 ID1  -0.1842130385     0           0   
 ID2  -1.3847740208     0           0   
 ID2  -1.2668185169     0           1   
 ID2   1.8034297622     1           1   
 ...
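
As a quick sanity check of the ID1 example, the arithmetic can be done by hand in base R. This is only a minimal sketch: the vectors `change` and `position1` are illustrative names, with values copied from the four ID1 rows of the table above.

```r
# Change values and Position1 flags for the four ID1 rows (copied from the table)
change    <- c(0.5941262037, -0.0418420656, 0.3766006166, -0.1842130385)
position1 <- c(1, 1, 1, 0)

# Mean Change where Position1 == 1 minus mean Change where Position1 == 0
diff_id1 <- mean(change[position1 == 1]) - mean(change[position1 == 0])
diff_id1
# approximately 0.4938413
```

The desired output repeats this calculation for every ID and every Position column.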

EDIT:

My output should be one value for every ID at every Position.

ID1-Position1:

ID2-Position2:

You could use dplyr with tidyr, which handles any number of Position columns:

 library(dplyr)
 library(tidyr)

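  dat %>% 
     # reshape to long form: one row per (IDs, Change, Position column)
     gather(Var, Val, starts_with("Position")) %>% 
     group_by(IDs, Var) %>% 
     # Val is 0/1; compare explicitly, because `!!` is treated as the
     # unquote operator inside dplyr verbs in current versions
     summarise(Mean=mean(Change[Val == 1], na.rm=TRUE)-mean(Change[Val == 0], na.rm=TRUE)) %>%
     spread(Var, Mean)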

which gives

   # IDs Position1 Position2
  #1 ID1 0.4938413 0.4938413
  #2 ID2 3.1292260 1.6530796
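
In current tidyr releases, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following is a sketch of the same pipeline with those functions, assuming dplyr >= 1.0.0 (for the `.groups` argument) and tidyr >= 1.0.0. The column names `Var`, `Val` and `Mean` simply mirror the code above, and the result name `res` is illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data, identical to the "data" section of this answer
dat <- data.frame(
  IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Change = c(0.5941262037, -0.0418420656, 0.3766006166, -0.1842130385,
             -1.3847740208, -1.2668185169, 1.8034297622),
  Position1 = c(1L, 1L, 1L, 0L, 0L, 0L, 1L),
  Position2 = c(1L, 1L, 1L, 0L, 0L, 1L, 1L)
)

res <- dat %>%
  # long form: one row per (IDs, Change, Position column)
  pivot_longer(starts_with("Position"), names_to = "Var", values_to = "Val") %>%
  group_by(IDs, Var) %>%
  # mean Change where the flag is 1 minus mean Change where it is 0
  summarise(Mean = mean(Change[Val == 1], na.rm = TRUE) -
                   mean(Change[Val == 0], na.rm = TRUE),
            .groups = "drop") %>%
  # back to wide form: one column per Position
  pivot_wider(names_from = Var, values_from = Mean)

res
```

This should produce the same two rows as the output shown above, returned as a tibble.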

Or, you could use data.table with reshape2

  library(reshape2)
  library(data.table)

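  # long form: one row per (IDs, Change, Position column), keyed for grouping
  DT <-  data.table(melt(dat, id.vars=c("IDs", "Change")), key=c("IDs", "variable"))
  # difference of group means per ID and Position column, then back to wide form
  dcast(DT[, list(mean(Change[!!value], na.rm=TRUE)-mean(Change[!value], na.rm=TRUE)),
                 by=list(IDs, variable)], 
                          IDs~variable, value.var="V1")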
   #  IDs Position1 Position2
   #1 ID1 0.4938413 0.4938413
   #2 ID2 3.1292260 1.6530796
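
data.table also ships its own `melt()` and `dcast()` methods, so reshape2 is not strictly required. A minimal sketch under that assumption follows. The names `long`, `diffs`, `wide`, `flag` and `diff` are illustrative and not part of the original answer.

```r
library(data.table)

# Sample data, identical to the "data" section of this answer
DT <- data.table(
  IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Change = c(0.5941262037, -0.0418420656, 0.3766006166, -0.1842130385,
             -1.3847740208, -1.2668185169, 1.8034297622),
  Position1 = c(1L, 1L, 1L, 0L, 0L, 0L, 1L),
  Position2 = c(1L, 1L, 1L, 0L, 0L, 1L, 1L)
)

# Long form: one row per (IDs, Change, Position column)
long <- melt(DT, id.vars = c("IDs", "Change"),
             variable.name = "Position", value.name = "flag")

# Per ID and Position column: mean Change where flag == 1 minus where flag == 0
diffs <- long[, .(diff = mean(Change[flag == 1], na.rm = TRUE) -
                         mean(Change[flag == 0], na.rm = TRUE)),
              by = .(IDs, Position)]

# Wide form: one row per ID, one column per Position
wide <- dcast(diffs, IDs ~ Position, value.var = "diff")
wide
```

Naming the computed column (`diff` here) avoids relying on the automatic `V1` name when reshaping back to wide form.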

Or using base R

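   # split the data by ID (IDs column dropped, so column 1 of each piece is Change)
   do.call(`rbind`,
        lapply(split(dat[,-1], dat$IDs), 
              function(x) {
                 # for each Position column y: mean Change where y is 1 minus where y is 0
                 apply(x[,-1], 2, function(y) mean(x[,1][!!y], na.rm=TRUE)-
                                               mean(x[,1][!y], na.rm=TRUE))}))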
  #  Position1 Position2
  #ID1 0.4938413 0.4938413
  #ID2 3.1292260 1.6530796

data

 dat <- structure(list(IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", 
 "ID2"), Change = c(0.5941262037, -0.0418420656, 0.3766006166, 
 -0.1842130385, -1.3847740208, -1.2668185169, 1.8034297622), Position1 = c(1L, 
 1L, 1L, 0L, 0L, 0L, 1L), Position2 = c(1L, 1L, 1L, 0L, 0L, 1L, 
 1L)), .Names = c("IDs", "Change", "Position1", "Position2"), class = "data.frame", 
 row.names = c(NA, -7L))

Splitting the data frame according to IDs and doing the required operation for each ID seems to be the most straightforward way.

library(plyr)

X <- data.frame(IDs = c(1,1,1,1,2,2,2), change = 1:7, Position1 = c(1,1,1,0,0,0,1))

# For each ID: mean change where Position1 == 1 minus mean change where Position1 == 0
Y <- ddply(X, "IDs", function(df) {
  mean(subset(df, Position1 == 1)$change) - 
    mean(subset(df, Position1 == 0)$change)
})

Y
#    IDs   V1
# 1   1   -2.0
# 2   2    1.5
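
The same split-and-apply idea can be sketched in base R without plyr. This is only an illustration on the toy data `X` from above. The result name `diffs` is made up for the example.

```r
# Toy data from the plyr example above
X <- data.frame(IDs = c(1,1,1,1,2,2,2), change = 1:7, Position1 = c(1,1,1,0,0,0,1))

# Split by ID, then compute the difference of means within each piece
diffs <- sapply(split(X, X$IDs), function(df) {
  mean(df$change[df$Position1 == 1]) - mean(df$change[df$Position1 == 0])
})

diffs
#    1    2 
# -2.0  1.5 
```

Here `sapply` returns a named numeric vector keyed by ID rather than a data frame, which is often enough for a single Position column.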
