简体   繁体   中英

Mapping multiple columns using dplyr or tidyverse in R

My data are similar to the following data. Note some two surnames or names which I did not mentioned here

 df1<-read.table(text= "id Surname Name 1234 Hamidsar Smith 139 Sandarscom Vicky 234 Bates May 100 Murphu Beki 941 Jool Susan 469 Sali John 990 susai Alison",header=TRUE)

The another data frame is as follows:

 df2<-read.table(text= "id Surname Name 990 susai Anis 81B Rosak Roy 340 Molipoor Vicky 139 Bates May 941 Ameri David 990 susai Alison 139 Bates May 101 CICI Beki 139 Sandarscom Vicky 1234 Hamidsar Smith",header=TRUE)

I want to map id , surname and name to get the following table

 id Surname Name Map 1234 Hamidsar Smith id,Surname,Name 139 Sandarscom Vicky id,Surname,Name 234 Bates May Surname, Name 100 Murphu Beki Name 941 Jool Susan id 469 Sali John NA 990 susai Alison id,Surname,Name

Is it possible to get it ? I read some links, but they did not help me

  library(tidyverse)
  library(glue)

First, I'm setting stringsAsFactors=FALSE to your code to produce the data to avoid pain down the road.

  df1<-read.table(text= "id Surname Name
1234    Hamidsar    Smith
139 Sandarscom  Vicky
234 Bates   May
100 Murphu  Beki
941 Jool    Susan
469 Sali    John
990 susai   Alison",header=TRUE, stringsAsFactors=FALSE)

  df2<-read.table(text= "id Surname Name
990 susai   Anis
81B Rosak   Roy
340 Molipoor    Vicky
139 Bates   May
941 Ameri   David
990 susai   Alison
139 Bates   May
101 CICI    Beki
139 Sandarscom  Vicky
1234    Hamidsar    Smith",header=TRUE, stringsAsFactors=FALSE)

Below I ensure id is set to character type so that it's the same in df2 and then I'm just applying a lookup to say "if this value is in df2 , show the name of the variable, otherwise nothing". Then I'm using glue to put them together into a Map variable.

  df1 %>% 
    mutate(id = as.character(id),
           Map_id = if_else(id %in% df2$id, "id", NA_character_),
           Map_Surname = if_else(Surname %in% df2$Surname, "Surname", NA_character_),
           Map_Name = if_else(Name %in% df2$Name, "Name", NA_character_),
           Map = glue("{Map_id} {Map_Surname} {Map_Name}", .na = "") %>% 
             str_trim() %>% 
             str_replace_all(" ", ", ")) %>% 
    select(id, Surname, Name, Map)

#>     id    Surname   Name               Map
#> 1 1234   Hamidsar  Smith id, Surname, Name
#> 2  139 Sandarscom  Vicky id, Surname, Name
#> 3  234      Bates    May     Surname, Name
#> 4  100     Murphu   Beki              Name
#> 5  941       Jool  Susan                id
#> 6  469       Sali   John                  
#> 7  990      susai Alison id, Surname, Name

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM