
Turn specific row values into columns using R

I have a dataset similar to the one below:

v1 <-c('name1','0/0','0/1','1/1','name2','0/0','1/1','name3','0/0','0/1','1/1','name4','0/0','0/1','1/1','name5','0/0','1/1')
v2 <- c(NA,95,3,2,NA,98,2,NA,93,5,2,NA,94,3,3,NA,96,4)
df <- cbind(v1,v2)
df <- as.data.frame(df)
df

which looks like:

v1    v2
name1 NA
0/0   95
0/1   3
1/1   2
name2 NA
0/0   98
1/1   2
name3 NA
0/0   93
0/1   5
1/1   2

How can I reshape the data frame to look like this:

names 0/0 0/1 1/1
name1 95  3   2
name2 98  NA  2
name3 93  5   2

Using reshape, the closest I have gotten is:

v1 0/0 0/1 1/1
1  95  .   .
2  .   3   .
3  .   .   2
4  98  .   .
5  .   .   2

Thank you!

If you are willing to use the tidyverse group of packages rather than reshape2, you can try this. The steps are:

  1. Create a new names column and fill it with any value from v1 that contains the string "name"
  2. Fill the new names column downward, and filter out any rows that have an NA in the v2 column
  3. Use tidyr::pivot_wider to change to a wide-format data frame

v1 <- c('name1','0/0','0/1','1/1','name2','0/0','1/1','name3','0/0','0/1','1/1','name4','0/0','0/1','1/1','name5','0/0','1/1')
v2 <- c(NA,95,3,2,NA,98,2,NA,93,5,2,NA,94,3,3,NA,96,4)
df <- cbind(v1,v2)
df <- as.data.frame(df)

library(tidyverse)

df %>%
  mutate(names = ifelse(str_detect(v1, 'name'), v1, NA)) %>%
  fill(names) %>%
  filter(!is.na(v2)) %>%
  pivot_wider(names_from = v1,
              values_from = v2)
#> # A tibble: 5 x 4
#>   names `0/0` `0/1` `1/1`
#>   <chr> <chr> <chr> <chr>
#> 1 name1 95    3     2    
#> 2 name2 98    <NA>  2    
#> 3 name3 93    5     2    
#> 4 name4 94    3     3    
#> 5 name5 96    <NA>  4

Created on 2021-08-20 by the reprex package (v2.0.0)
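
Note that the value columns above are `<chr>` because cbind() builds a character matrix before as.data.frame() is called. If numeric values are wanted, one option is to build the data frame with data.frame() instead. The following is a minimal sketch under that assumption; the names df_num and wide are illustrative and not part of the original answer, and grepl() is used in place of str_detect() only so that dplyr and tidyr are the sole dependencies.

```r
library(dplyr)
library(tidyr)

v1 <- c('name1','0/0','0/1','1/1','name2','0/0','1/1','name3','0/0','0/1','1/1',
        'name4','0/0','0/1','1/1','name5','0/0','1/1')
v2 <- c(NA,95,3,2,NA,98,2,NA,93,5,2,NA,94,3,3,NA,96,4)

# data.frame() keeps v2 numeric; cbind() would coerce both columns to character
df_num <- data.frame(v1, v2)

wide <- df_num %>%
  mutate(names = ifelse(grepl("^name", v1), v1, NA)) %>%  # mark the header rows
  fill(names) %>%                                         # carry each name downward
  filter(!is.na(v2)) %>%                                  # drop the header rows
  pivot_wider(names_from = v1, values_from = v2)

wide
```

With this variant the `0/0`, `0/1` and `1/1` columns should come out as numeric, so they can be summed or compared without an extra as.numeric() step.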

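Another approach is to split the data into one group per name and pivot each group separately. How it works:

  1. Create an id_Group column
  2. Use group_split to split by id_Group
  3. This gives a list of data frames, my_list
  4. Use lapply to apply pivot_wider to every data frame in the list
  5. Use bind_rows to combine the list of data frames into one data frame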
  6. To get the name column, use pivot_longer and then keep, for each group, the row whose name belongs to that group:

library(tidyverse)

my_list <- df %>% 
    mutate(id_Group = cumsum(is.na(v2))) %>%   # a new group starts at every NA in v2
    group_split(id_Group) 

# pivot each group separately; the name row becomes an all-NA column (name1, name2, ...)
df_list <- lapply(my_list, 
                  function(x) pivot_wider(x, names_from = v1, values_from = v2))

bind_rows(df_list) %>% 
    pivot_longer(
        cols = starts_with("name"),
        names_to = "name"
    ) %>% 
    group_by(id_Group) %>% 
    # the name columns appear in group order, so row k of group k holds that group's name
    filter(row_number() == id_Group) %>% 
    ungroup() %>% 
    select(name, contains("/"), -id_Group, -value)

  name  `0/0` `0/1` `1/1`
  <chr> <chr> <chr> <chr>
1 name1 95    3     2    
2 name2 98    NA    2    
3 name3 93    5     2    
4 name4 94    3     3    
5 name5 96    NA    4    
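
The id_Group step relies on the fact that every name row has an NA in v2, so a running count of NAs increases by one at each new name. A minimal sketch on a shortened copy of v2 (the shortened vector is only for illustration):

```r
# shortened copy of v2: two name rows (NA), each followed by its values
v2 <- c(NA, 95, 3, 2, NA, 98, 2)

# is.na() flags the name rows; cumsum() turns the flags into a running group id
id_Group <- cumsum(is.na(v2))
id_Group
#> [1] 1 1 1 1 2 2 2
```

Each name row and the values below it share one id, which is what group_split uses to cut the data into per-name pieces.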
