Create columns in R data.table based on sub-strings of existing column
create data.table column based on content of strings in other columns in r
I am working with (wrestling with, cursing at) data similar to the following:
library(data.table)
names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(tripsID = 1:3, tripNames = c("Mexico", "Alaska", "New Jersey"),
                    tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po")
)
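I want to create a new table like the one below: it takes tripMembers and attaches the correct namesID to the correct tripsID and tripNames. I assume this is a join (I have tried many joins)?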
namesTrips
tripsID tripNames namesID
1 "Mexico" 1
1 "Mexico" 2
1 "Mexico" 3
2 "Alaska" 1
2 "Alaska" 3
3 "New Jersey" 1
3 "New Jersey" 2
You can do it like this:
# split the tripMembers column and unnest it; then join with names on the tripMembers
namesTrips <- trips[, .(tripMembers = unlist(strsplit(tripMembers, "\\|"))),
by = .(tripsID, tripNames)][names, on = .(tripMembers = fullName)]
namesTrips[, tripMembers := NULL][order(tripsID)]
# tripsID tripNames namesID
#1: 1 Mexico 1
#2: 1 Mexico 2
#3: 1 Mexico 3
#4: 2 Alaska 1
#5: 2 Alaska 3
#6: 3 New Jersey 1
#7: 3 New Jersey 2
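The same idea can also be written as two explicit steps, which may be easier to debug. This is only a sketch, not the answerer's code: the intermediate table `long` and the choice to rename the split column to `fullName` are assumptions made for illustration. It looks up each trip member in `names` (rather than each name in the trips), so a member that is missing from `names` shows up with `namesID = NA` instead of silently disappearing.

```r
library(data.table)

names <- data.table(namesID = 1:3, fullName = c("bob so", "larry po", "sam ho"))
trips <- data.table(
  tripsID     = 1:3,
  tripNames   = c("Mexico", "Alaska", "New Jersey"),
  tripMembers = c("bob so|larry po|sam ho", "bob so|sam ho", "bob so|larry po")
)

# Step 1: one row per (trip, member).
# fixed = TRUE makes strsplit treat "|" as a literal character, not a regex.
# `long` is a hypothetical intermediate name used only in this sketch.
long <- trips[, .(fullName = unlist(strsplit(tripMembers, "|", fixed = TRUE))),
              by = .(tripsID, tripNames)]

# Step 2: for every row of `long`, look up the matching namesID in `names`.
# The result keeps the row order of `long`; unmatched members get namesID = NA.
namesTrips <- names[long, on = "fullName", .(tripsID, tripNames, namesID)]

print(namesTrips)
```

With the sample data, every member has a match, so `namesTrips` has the same seven rows as the output above, already ordered by `tripsID`. If unmatched members should be dropped rather than kept as `NA`, adding `nomatch = NULL` to the join in step 2 is one option in recent data.table versions.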