简体   繁体   中英

How do I collapse rows to fill NAs in groups with uneven number of rows per column?

I'm trying to format this data to feed into a formattable table.

Sample data:

test <- tibble(group = c("A", "B", "B", "B", "C", "D"), 
           name = c("Steve", "Mike", "Paul", "Jaimie", "John", "Marco"),
           year = c(1988, 2001, 2001, 2002, 1997, 2000))

which gives:

# A tibble: 6 x 3
  group name    year
  <chr> <chr>  <dbl>
1 A     Steve   1988
2 B     Mike    2001
3 B     Paul    2001
4 B     Jaimie  2002
5 C     John    1997
6 D     Marco   2000

I want the formattable table to have group and year as columns so I spread by:

test %>% 
  mutate(grouped_id = row_number()) %>%
  spread(key = year, value = name) 

which gives:

# A tibble: 6 x 7
  group grouped_id `1988` `1997` `2000` `2001` `2002`
  <chr>      <int> <chr>  <chr>  <chr>  <chr>  <chr> 
1 A              1 Steve  NA     NA     NA     NA    
2 B              2 NA     NA     NA     Mike   NA    
3 B              3 NA     NA     NA     Paul   NA    
4 B              4 NA     NA     NA     NA     Jaimie
5 C              5 NA     John   NA     NA     NA    
6 D              6 NA     NA     Marco  NA     NA    

For group "B", I want "Jaimie" in column 2002 to collapse to the first row of group "B" so that group "B" doesn't have three rows. The result should look like this:

# A tibble: 6 x 7
  group grouped_id `1988` `1997` `2000` `2001` `2002`
  <chr>      <int> <chr>  <chr>  <chr>  <chr>  <chr> 
1 A              1 Steve  NA     NA     NA     NA    
2 B              2 NA     NA     NA     Mike   Jaimie    
3 B              3 NA     NA     NA     Paul   NA    
4 C              5 NA     John   NA     NA     NA    
5 D              6 NA     NA     Marco  NA     NA    

I've tried this and this .

Thanks for your help.

We can do a group by row_number

library(dplyr)
library(tidyr)
test %>%
   group_by(year) %>% 
   mutate(rn = row_number()) %>%
   ungroup %>%
   spread(year, name) %>%
   select(-rn)
# A tibble: 5 x 6
#  group `1988` `1997` `2000` `2001` `2002`
#  <chr> <chr>  <chr>  <chr>  <chr>  <chr> 
#1 A     Steve  <NA>   <NA>   <NA>   <NA>  
#2 B     <NA>   <NA>   <NA>   Mike   Jaimie
#3 B     <NA>   <NA>   <NA>   Paul   <NA>  
#4 C     <NA>   John   <NA>   <NA>   <NA>  
#5 D     <NA>   <NA>   Marco  <NA>   <NA>  

In the newer version of tidyr , it is better to use pivot_wider

test %>%
       group_by(year) %>% 
       mutate(rn = row_number()) %>%
       ungroup %>%
       pivot_wider(names_from = year, values_from = name) %>%
       select(-rn)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM