
How to repeat part of a row in a data frame

Here is a small example of the data frame I have:

data <- data.frame(station=rep(c(1,1,2),each=4), month=rep(c(2,3,2),each=4), day=rep(c(26:29),3),times=rep(c(1:4),3),place=c(1:8,1:4),V1=rep(9:12,3),V2=rep(9:12,3)) 

And this is the data frame I need:

data1 <- data.frame(station=rep(c(1,1,2),each=4), month=rep(c(2,3,2),each=4), day=rep(c(26:29),3),times=rep(c(1:4),3),place=c(1:8,1:4),V1=c(9,10,10,10,9:12,9,10,10,10),V2=c(9,10,10,10,9:12,9,10,10,10)) 

What I need is to copy the Feb 27 values of columns V1 and V2 onto the Feb 28 and Feb 29 rows. The original data has 300 stations and 60 years, so doing this by hand is not an option. I tried the following, but it doesn't work:

data1 <- ddply(data, .(station, month, times),function(x) x[x[3:4,2]==2,6:7] <- x[2,6:7])

Any advice would be appreciated, thanks

Here is one way to do it. With many columns you could use lapply, but I don't use it here since there are only two columns:

# Column 2 is month, column 3 is day: overwrite the Feb 28/29 values with the Feb 27 values
data$V1[data[,3] %in% c(28,29) & data[,2] %in% c(2)] <- data$V1[data[,3] %in% c(27) & data[,2] %in% c(2)]
data$V2[data[,3] %in% c(28,29) & data[,2] %in% c(2)] <- data$V2[data[,3] %in% c(27) & data[,2] %in% c(2)]

If you need to handle multiple columns, here is a solution:

do.call(cbind, lapply(data[,6:7], function(x) {
  x[data[,3] %in% c(28,29) & data[,2] %in% c(2)] <- x[data[,3] %in% c(27) & data[,2] %in% c(2)]
  x
}))
      V1 V2
 [1,]  9  9
 [2,] 10 10
 [3,] 10 10
 [4,] 10 10
 [5,]  9  9
 [6,] 10 10
 [7,] 11 11
 [8,] 12 12
 [9,]  9  9
[10,] 10 10
[11,] 10 10
[12,] 10 10

Note: Instead of data[,6:7] you can select whichever columns you want to replace; all the others remain the same.
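
One caveat with the indexed assignments above: they rely on R recycling the right-hand side, which gives the expected result here because every Feb 27 value happens to be 10. With stations whose Feb 27 values differ, the recycled values may not line up with the right station. A minimal sketch of a lookup that pairs each Feb 28/29 row with the Feb 27 row of its own station is shown below. The names is_target, is_source, key and idx are made up for illustration, and the key assumes exactly one Feb 27 row per station; with real data the key would presumably also need the year.

```r
# Example data from the question
data <- data.frame(station = rep(c(1, 1, 2), each = 4),
                   month   = rep(c(2, 3, 2), each = 4),
                   day     = rep(26:29, 3),
                   times   = rep(1:4, 3),
                   place   = c(1:8, 1:4),
                   V1      = rep(9:12, 3),
                   V2      = rep(9:12, 3))

is_target <- data$month == 2 & data$day %in% c(28, 29)  # rows to overwrite
is_source <- data$month == 2 & data$day == 27            # rows to copy from

# Grouping key (assumption: one Feb 27 row per station; add year for real data)
key <- data$station

# For each target row, the row number of the Feb 27 row with the same key
idx <- which(is_source)[match(key[is_target], key[is_source])]

# Copy all selected columns in one assignment
cols <- c("V1", "V2")
data[is_target, cols] <- data[idx, cols]
```

Because the source row is looked up by key rather than by position, the result does not depend on recycling or on how many target rows each station has.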

This is essentially a "last observation carried forward" problem, and as such the zoo package is helpful. Set everything on the 28th or 29th of February to NA, and then carry forward the values from the 27th using na.locf:

library(zoo)
data[c("V1","V2")][data$day %in% c(28,29) & data$month %in% c(2),] <- NA
keyvals <- data[c("V1","V2")][data$day %in% c(27,28,29) & data$month %in% c(2),]
data[c("V1","V2")][data$day %in% c(27,28,29) & data$month %in% c(2),] <- na.locf(keyvals)

Result:

> data
   station month day times place V1 V2
1        1     2  26     1     1  9  9
2        1     2  27     2     2 10 10
3        1     2  28     3     3 10 10
4        1     2  29     4     4 10 10
5        1     3  26     1     5  9  9
6        1     3  27     2     6 10 10
7        1     3  28     3     7 11 11
8        1     3  29     4     8 12 12
9        2     2  26     1     1  9  9
10       2     2  27     2     2 10 10
11       2     2  28     3     3 10 10
12       2     2  29     4     4 10 10

> all.equal(data,data1)
[1] TRUE
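
To make the carry-forward idea concrete, here is a rough base R sketch that fills within each station and month separately, so a value can never leak from one station into the next. The helper locf is hypothetical (it is not part of zoo) and only approximates na.locf with leading NAs kept. The sketch assumes the rows are already sorted by day within each group, and real data would presumably need the year in the grouping as well.

```r
# Example data from the question
data <- data.frame(station = rep(c(1, 1, 2), each = 4),
                   month   = rep(c(2, 3, 2), each = 4),
                   day     = rep(26:29, 3),
                   times   = rep(1:4, 3),
                   place   = c(1:8, 1:4),
                   V1      = rep(9:12, 3),
                   V2      = rep(9:12, 3))

# Hypothetical helper: carry the last non-NA value forward (leading NAs stay NA)
locf <- function(v) {
  idx <- cumsum(!is.na(v))   # position of the latest non-NA value seen so far
  idx[idx == 0] <- NA        # nothing to carry forward yet
  v[!is.na(v)][idx]
}

is_target <- data$month == 2 & data$day %in% c(28, 29)

# One group per station/month combination that actually occurs
grp <- interaction(data$station, data$month, drop = TRUE)

for (col in c("V1", "V2")) {
  x <- data[[col]]
  x[is_target] <- NA                      # blank out Feb 28/29
  data[[col]] <- ave(x, grp, FUN = locf)  # fill forward within each group
}
```

Grouping with ave keeps the original row order, so the filled columns can be written straight back into the data frame.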


 