简体   繁体   中英

Pairing columns and making dataframe from wide to long in r

I have this kind of dataframe :

id institution  name_a      info_a bfullname      idb
1  A            Chet Baker  666    Clifford Brown 123

I need to reshape it, keeping the id , institution and pair the columns keeping the values like this:

id institution role       name           id_name
1  A           student    Chet Baker     666
1  A           teacher    Clifford Brown 123

The role column is defined by the column name , which I have an id vector identifying like this:

value     id 
name_a    student
bfullname teacher

The problem is that I have a lot of columns with different names, I needed an way to specify which ones go along with another, or maybe a solution that I could rename the columns and do so.

I've seen a lot of reshape , dcast , melt and so on topics but still couldn't figure it out

Any ideas how to do it?

Forget reshape , use tidyr :

require(dplyr)
require(tidyr)

df <- tribble(
~id, ~institution,  ~name_a,      ~info_a, ~bfullname,      ~idb,
1,  "A",            "Chet Baker",  666,    "Clifford Brown", 123,
2,  "B",            "George Baker",  123,    "Charlie Brown", 234,
3,  "C",            "Banket Baker",  456,    "James Brown", 647,
4,  "D",            "Koeken Baker",  789,    "Golden Brown", 967
)

def <- tribble(~value, ~roleid, ~info,
"name_a",    "student", "info_a", 
"bfullname", "teacher", "idb")


def    

dflong <- df %>%
  gather(key, value, -id, -institution)


dflong %>%
  filter(key %in% def$value) %>%
  rename(role = key, name = value) %>%
  inner_join(def, by = c('role' = 'value')) %>%
  left_join(dflong %>% select(- institution), by = c('id' = 'id','info' = 'key'))

Which will result in:

# A tibble: 8 x 7
id institution role      name           roleid  info   value
      <dbl> <chr>       <chr>     <chr>          <chr>   <chr>  <chr>
1     1 A           name_a    Chet Baker     student info_a 666  
2     2 B           name_a    George Baker   student info_a 123  
3     3 C           name_a    Banket Baker   student info_a 456  
4     4 D           name_a    Koeken Baker   student info_a 789  
5     1 A           bfullname Clifford Brown teacher idb    123  
6     2 B           bfullname Charlie Brown  teacher idb    234  
7     3 C           bfullname James Brown    teacher idb    647  
8     4 D           bfullname Golden Brown   teacher idb    967  
library(data.table)

setDT(df)

melt(
  df,
  id.vars = 1:2, 
  measure.vars = list(name = c(3, 5), id_name = c(4, 6)),
  variable.name = "role"
)
#>    id institution role           name id_name
#> 1:  1           A    1     Chet Baker     666
#> 2:  1           A    2 Clifford Brown     123

Where df is:

df <- read.table(text = '
id institution  name_a      info_a bfullname      idb
1  A            "Chet Baker"  666    "Clifford Brown" 123
', header = TRUE)

Created on 2019-02-14 by the reprex package (v0.2.1)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM