
Recode column values using dplyr

I am having trouble recoding values (probably because I am new to dplyr). I want to group the rows by participant number and then recode the day value so that it starts at 1 for each participant. The day column currently holds the day of the month; my goal is to turn it into the day of the experiment. Note: the first date listed for a participant should be day 1 for them.

My attempt:

df<-data.frame(participant_number=c(1,1,1,2,2),month=c(3,3,4,3,3),day=c(6,6,1,7,8))
res<-setDT(df) %>% group_by(participant_number) %>% day 

My goal:

participant_number  day  month  recoded_day
1                   6    3      1
1                   6    3      1
1                   1    4      2
2                   7    3      1
2                   8    3      2

I see setDT() in your code, so here's a complete data.table solution in case you are interested.

library(data.table)
setDT(df)[, 
    recoded_day := cumsum(c(1, diff(as.IDate(paste(month, day), "%m %d")))), 
    by = participant_number
]

which gives us

   participant_number month day recoded_day
1:                  1     3   6          1
2:                  1     3   6          1
3:                  1     4   1         27
4:                  2     3   7          1
5:                  2     3   8          2
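
To see why the cumsum(c(1, diff(...))) expression above yields 1, 1, 27 for participant 1, here is a minimal base R sketch of the same arithmetic on that participant's three dates. The year 2015 is an assumption added for illustration (the data has no year column), and the sketch assumes the rows are already in date order within the participant.

```r
# Participant 1's dates; the year 2015 is assumed, the question's data has none
dates <- as.Date(c("2015-03-06", "2015-03-06", "2015-04-01"))

# Gaps in days between consecutive rows: 0 and 26
gaps <- as.numeric(diff(dates))

# Start the first row at day 1, then accumulate the gaps
recoded_day <- cumsum(c(1, gaps))
print(recoded_day)  # 1 1 27
```

Because the result is built from gaps between neighbouring rows, unsorted rows would produce negative steps, so sorting by date within each participant first is the safer choice.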

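If every date falls within the same month, you could subtract each participant's first day. The output below is for a version of the data that has only a day column, with days 6, 6, 7 for participant 1 and 7, 8 for participant 2: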

library(dplyr)
df %>% group_by(participant_number) %>%
       mutate(recoded_day = day - day[1] + 1) 

Source: local data frame [5 x 3]
Groups: participant_number [2]

  participant_number   day recoded_day
               (dbl) (dbl)       (dbl)
1                  1     6           1
2                  1     6           1
3                  1     7           2
4                  2     7           1
5                  2     8           2

EDIT: If you have months and days, first combine them into a Date column (note that you need a year, which matters when a leap year is involved):

df$date <- as.Date(paste(df$month, df$day, "2015"), format = "%m %d %Y") 
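
As a small illustration of why the year matters, the sketch below compares the same February-to-March gap in a common year and in a leap year. The specific dates are made up for illustration and do not come from the question's data.

```r
# Days from 28 February to 1 March in a common year (2015) and a leap year (2016)
gap_2015 <- as.numeric(as.Date("2015-03-01") - as.Date("2015-02-28"))
gap_2016 <- as.numeric(as.Date("2016-03-01") - as.Date("2016-02-28"))

print(c(gap_2015, gap_2016))  # 1 2
```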

Then use the same code on this new date column:

df %>% group_by(participant_number) %>%
       mutate(recoded_day = as.numeric(date - date[1] + 1)) 

Source: local data frame [5 x 5]
Groups: participant_number [2]

  participant_number month   day       date recoded_day
               (dbl) (dbl) (dbl)     (date)       (dbl)
1                  1     3     6 2015-03-06           1
2                  1     3     6 2015-03-06           1
3                  1     4     1 2015-04-01          27
4                  2     3     7 2015-03-07           1
5                  2     3     8 2015-03-08           2
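
Note that the date-based results count calendar days, so participant 1's third row becomes 27, whereas the goal table in the question shows 2 there. If the intent is instead to number each participant's distinct dates in order (1, 2, 3, and so on), one possible sketch uses dplyr's dense_rank() on the date column. This is an interpretation of the goal table rather than something stated in the answers, and the year 2015 is again an assumed placeholder.

```r
library(dplyr)

df <- data.frame(
  participant_number = c(1, 1, 1, 2, 2),
  month = c(3, 3, 4, 3, 3),
  day = c(6, 6, 1, 7, 8)
)

res <- df %>%
  # Build a real date; the year 2015 is an assumption, the data has no year
  mutate(date = as.Date(paste(month, day, "2015"), format = "%m %d %Y")) %>%
  group_by(participant_number) %>%
  # dense_rank() gives tied dates the same rank and leaves no gaps between ranks
  mutate(recoded_day = dense_rank(date)) %>%
  ungroup()

print(res$recoded_day)  # 1 1 2 1 2
```

Unlike date - date[1] + 1, this ranks by date order rather than by row order, so it treats each participant's earliest date as day 1 even if it is not listed first.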
