
Reshaping data with tidyr

I have data in the format:

  sample  height  width  weight
1 a       h1      w1     p1    
2 a       h2      w2     p2    
3 b       h3      w3     p3    
4 b       h4      w4     p4

where h1 and h2 are replicate measurements for the height of sample "a", h3 and h4 are replicate measurements for sample "b", etc.

I need to put replicate measurements side by side:

  sample height1 height2 width1 width2 weight1 weight2
1 a       h1      h2      w1     w2     p1      p2    
2 b       h3      h4      w3     w4     p3      p4    

I have been fiddling with gather and spread but haven't been able to get what I wanted. Any help please?

Thanks!

data

df1 <- structure(list(sample = c("a", "a", "b", "b"), height = c("h1", 
"h2", "h3", "h4"), width = c("w1", "w2", "w3", "w4"), weight = c("p1", 
"p2", "p3", "p4")), .Names = c("sample", "height", "width", "weight"
), row.names = c(NA, -4L), class = "data.frame")

We can gather the data into 'long' format, create a sequence column within each sample/key group, and then spread it back to 'wide' format:

library(tidyverse)
df1 %>%
  gather(key, val, height:weight) %>% 
  group_by(sample, key) %>% 
  mutate(n = row_number()) %>%
  unite(keyn, key, n, sep="") %>% 
  spread(keyn, val)
# A tibble: 2 x 7
# Groups:   sample [2]
#   sample height1 height2 weight1 weight2 width1 width2
#  <chr>  <chr>   <chr>   <chr>   <chr>   <chr>  <chr> 
#1 a      h1      h2      p1      p2      w1     w2    
#2 b      h3      h4      p3      p4      w3     w4    
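In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A minimal sketch of the same idea with pivot_wider() follows, assuming tidyr 1.0 or later is installed; the column name n and the result name wide are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df1 <- data.frame(
  sample = c("a", "a", "b", "b"),
  height = c("h1", "h2", "h3", "h4"),
  width  = c("w1", "w2", "w3", "w4"),
  weight = c("p1", "p2", "p3", "p4"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  group_by(sample) %>%
  mutate(n = row_number()) %>%   # replicate index within each sample
  ungroup() %>%
  pivot_wider(
    names_from  = n,
    values_from = c(height, width, weight),
    names_sep   = ""             # height1 rather than height_1
  )

wide
```

Because the three measurement columns are widened in one call, there is no need to go through the long format first, and the columns come out in the order requested in the question (height, width, weight).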

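Or another option with tidyverse: nest each sample's rows, flatten them into a single row, and unnest

df1 %>%
    group_by(sample) %>%
    nest() %>% 
    mutate(data = map(data, ~ 
                       unlist(.x) %>%    # names become height1, height2, ...
                       as.list() %>%
                       as_tibble())) %>% 
    unnest(data)   # name the column explicitly; newer tidyr warns if it is omitted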
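The nested approach relies on how unlist() names its result: when a named list element has more than one value, the element name gets a running index appended. A small base R sketch of just that step, using a hand-built list (rows_a and flat are hypothetical names) in place of the nested tibble for sample "a":

```r
# Stand-in for the nested data of sample "a"
rows_a <- list(
  height = c("h1", "h2"),
  width  = c("w1", "w2"),
  weight = c("p1", "p2")
)

flat <- unlist(rows_a)
names(flat)
# "height1" "height2" "width1" "width2" "weight1" "weight2"

# Turning the named vector into a list gives one column per name, i.e. one wide row
as.data.frame(as.list(flat))
```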

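Or we can use reshape from base R (note that the new column names get a "." separator and are grouped by replicate rather than by measurement):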

df1$ind <- with(df1, ave(seq_along(sample), sample, FUN = seq_along))
reshape(df1, idvar= c("sample"), timevar = "ind", direction = "wide")
#   sample height.1 width.1 weight.1 height.2 width.2 weight.2
#1      a       h1      w1       p1       h2      w2       p2
#3      b       h3      w3       p3       h4      w4       p4
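If the exact layout from the question is needed, the reshape() result can be tidied afterwards. This is only a sketch under the assumption that every sample has exactly two replicates; the names wide and measures are introduced here for illustration.

```r
df1 <- data.frame(
  sample = c("a", "a", "b", "b"),
  height = c("h1", "h2", "h3", "h4"),
  width  = c("w1", "w2", "w3", "w4"),
  weight = c("p1", "p2", "p3", "p4"),
  stringsAsFactors = FALSE
)

# Replicate index within each sample, as in the answer above
df1$ind <- ave(seq_along(df1$sample), df1$sample, FUN = seq_along)
wide <- reshape(df1, idvar = "sample", timevar = "ind", direction = "wide")

# Drop the "." separator: height.1 -> height1
names(wide) <- sub(".", "", names(wide), fixed = TRUE)

# Reorder so replicates of one measurement sit side by side
# (assumes exactly two replicates per sample)
measures <- c("height", "width", "weight")
wide <- wide[, c("sample", paste0(rep(measures, each = 2), 1:2))]
rownames(wide) <- NULL

wide
```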

data

df1 <- structure(list(sample = c("a", "a", "b", "b"), height = c("h1", 
 "h2", "h3", "h4"), width = c("w1", "w2", "w3", "w4"), weight = c("p1", 
 "p2", "p3", "p4")), class = "data.frame", row.names = c(NA, -4L
  ))

Although you asked for tidyr::spread, here is a solution using data.table's dcast:

library(data.table)
setDT(df1)
dcast(df1, sample ~ rowid(sample), value.var = c("height", "width", "weight"), sep = "")
#   sample height1 height2 width1 width2 weight1 weight2
#1: a       h1      h2      w1     w2     p1      p2    
#2: b       h3      h4      w3     w4     p3      p4 
