
Create a data.table column based on the content of strings in other columns in R

I'm working/wrangling/cursing with data similar to these:

library(data.table)

names <- data.table(namesID = 1:3, fullName =  c("bob so", "larry po", "sam ho"))

trips <- data.table(tripsID = 1:3, tripNames= c("Mexico", "Alaska", "New Jersey"), 
                    tripMembers = c("bob so|larry po|sam ho","bob so|sam ho", "bob so|larry po")
                   ) 

I want to create a new table like this, taking the tripMembers and connecting the correct namesID to the correct tripsID and tripNames. I guess this is a join (I have tried many joins)?

namesTrips 
tripsID   tripNames          namesID
1         "Mexico"            1
1         "Mexico"            2
1         "Mexico"            3
2         "Alaska"            1
2         "Alaska"            3
3         "New Jersey"        1
3         "New Jersey"        2

You can do something like this:

# split the tripMembers column and unnest it; then join with names on the tripMembers
namesTrips <- trips[, .(tripMembers = unlist(strsplit(tripMembers, "\\|"))), 
                      by = .(tripsID, tripNames)][names, on = .(tripMembers = fullName)]

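# drop the helper column and sort by trip; assign the result so the sorted order is kept
namesTrips <- namesTrips[, tripMembers := NULL][order(tripsID)]
namesTrips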

#   tripsID  tripNames namesID
#1:       1     Mexico       1
#2:       1     Mexico       2
#3:       1     Mexico       3
#4:       2     Alaska       1
#5:       2     Alaska       3
#6:       3 New Jersey       1
#7:       3 New Jersey       2
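
As a variation on the approach above (a sketch, not part of the original answer), you could keep the split table as the left side and pull in namesID with an update join. The names `long` and `namesTrips2` are made up for illustration. With this form the rows stay in trip order without an explicit sort, and any member that has no match in names would get NA for namesID rather than being dropped.

```r
library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(
  tripsID     = 1:3,
  tripNames   = c("Mexico", "Alaska", "New Jersey"),
  tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po")
)

# One row per (trip, member); fixed = TRUE treats "|" as a literal character
long <- trips[, .(fullName = unlist(strsplit(tripMembers, "|", fixed = TRUE))),
              by = .(tripsID, tripNames)]

# Update join: look up namesID by fullName and add it to `long` by reference
long[names, on = "fullName", namesID := i.namesID]

# Keep only the columns wanted in the final table
namesTrips2 <- long[, .(tripsID, tripNames, namesID)]
namesTrips2
```

Whether this reads better than the right join is a matter of taste; the update join mainly helps when trips is the table you want to preserve row for row.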


 