Create columns in R data.table based on sub-strings of existing column
Create data.table column based on content of strings in other columns in R
I am working/wrangling/cursing with data similar to the following:
library(data.table)
names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(tripsID = 1:3, tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po")
)
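I would like to create a new table like the one below: take tripMembers and connect the correct namesID to the correct tripsID and tripNames. I imagine this is a join (I have tried lots of joins)?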
namesTrips
tripsID tripNames namesID
1 "Mexico" 1
1 "Mexico" 2
1 "Mexico" 3
2 "Alaska" 1
2 "Alaska" 3
3 "New Jersey" 1
3 "New Jersey" 2
You could do it like this:
# split the tripMembers column and unnest it; then join with names on the tripMembers
namesTrips <- trips[, .(tripMembers = unlist(strsplit(tripMembers, "\\|"))),
                    by = .(tripsID, tripNames)][names, on = .(tripMembers = fullName)]
namesTrips[, tripMembers := NULL][order(tripsID)]
# tripsID tripNames namesID
#1: 1 Mexico 1
#2: 1 Mexico 2
#3: 1 Mexico 3
#4: 2 Alaska 1
#5: 2 Alaska 3
#6: 3 New Jersey 1
#7: 3 New Jersey 2
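If it helps to see what the one-liner is doing, here is the same idea broken into two separate steps. This is only a sketch of a variant, not the code above: the intermediate name tripsLong is made up for illustration, strsplit is called with fixed = TRUE instead of the escaped regex "\\|", and the join is written as names[tripsLong, ...] rather than joining onto names.

```r
library(data.table)

# Same sample data as in the question.
names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(tripsID = 1:3,
                    tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po"))

# Step 1: one row per (trip, member). fixed = TRUE treats "|" as a literal
# character, so no regex escaping is needed.
# (tripsLong is a hypothetical name used only in this sketch.)
tripsLong <- trips[, .(fullName = unlist(strsplit(tripMembers, "|", fixed = TRUE))),
                   by = .(tripsID, tripNames)]

# Step 2: look up namesID for every member. Here tripsLong drives the join,
# so the result has one row per row of tripsLong, in the same order.
namesTrips <- names[tripsLong, on = "fullName", .(tripsID, tripNames, namesID)]
print(namesTrips)
```

Because tripsLong is the driving table in step 2, the rows come out already grouped by trip, so no extra order() call is needed, and a member whose name is not in names would appear with namesID = NA rather than being dropped.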