简体   繁体   中英

reshaping data with time represented as spells

I have a dataset in which time is represented as spells (ie from time 1 to time 2), like this:

d <- data.frame(id = c("A","A","B","B","C","C"),
                t1 = c(1,3,1,3,1,3),
                t2 = c(2,4,2,4,2,4),
                value = 1:6)

I want to reshape this into a panel dataset, ie one row for each unit and time period, like this:

result <- data.frame(id = c("A","A","A","A","B","B","B","B","C","C","C","C"),
                     t= c(1:4,1:4,1:4),
                     value = c(1,1,2,2,3,3,4,4,5,5,6,6))

I am attempting to do this with tidyr and gather but not getting the desired result. I am trying something like this which is clearly wrong:

gather(d, 't1', 't2', key=t)

In the actual dataset the spells are irregular.

So, the goal is to melt t1 and t2 columns and to drop the key column that will appear as a result. There are a couple of options. Base R's reshape seems to be tedious. We may, however, use melt :

library(reshape2)
melt(d, measure.vars = c("t1", "t2"), value.name = "t")[-3]
#    id value t
# 1   A     1 1
# 2   A     2 3
# 3   B     3 1
# 4   B     4 3
# 5   C     5 1
# 6   C     6 3
# 7   A     1 2
# 8   A     2 4
# 9   B     3 2
# 10  B     4 4
# 11  C     5 2
# 12  C     6 4

where -3 drop the key column. We may indeed also use gather as in

gather(d, "key", "t", t1, t2)[-3]
#    id value t
# 1   A     1 1
# 2   A     2 3
# 3   B     3 1
# 4   B     4 3
# 5   C     5 1
# 6   C     6 3
# 7   A     1 2
# 8   A     2 4
# 9   B     3 2
# 10  B     4 4
# 11  C     5 2
# 12  C     6 4

You were almost there.

Code

d %>%
    # Gather the needed variables. Explanation:
    # t_type: How will the call the column where we will put the former
    #         variable names under?
    # t:      How will we call the column where we will put the
    #         values of above variables?
    # -id,
    # -value: Which columns should stay the same and NOT be gathered
    #         under t_type (key) and t (value)?
    # 
    gather(t_type, t, -id, -value) %>%
    # Select the right columns in the right order. 
    # Watch out: We did not select t_type, so it gets dropped.
    select(id, t, value) %>%
    # Arrange / sort the data by the following columns.
    # For a descending order put a "-" in front of the column name.  
    arrange(id, t)

Result

   id t value
1   A 1     1
2   A 2     1
3   A 3     2
4   A 4     2
5   B 1     3
6   B 2     3
7   B 3     4
8   B 4     4
9   C 1     5
10  C 2     5
11  C 3     6
12  C 4     6

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM