With the following two data frames
> d1
keystr keynum
1 abc 5
2 def 2
3 def 7
4 abc 3
> d2
HD 2 3 5 7
1 abc H I J K
2 def L M N P
I would like to insert a column d1$val that uses the string in keystr
and the number in keynum
as indices in the d2
data frame. The result should be:
> d1
keystr keynum val
1 abc 5 J
2 def 2 L
3 def 7 P
4 abc 3 I
This should be an indirect application of mapply. How can I make the code below
d1 <- data.frame("keystr"=c("abc","def","def","abc"), "keynum"=c(5,2,7,3))
d2 <- data.frame("HD"=c("abc","def"),
"2"=c("H","L"), "3"=c("I","M"),
"5"=c("J","N"), "7"=c("K","P"))
d1$val <- mapply(function(kstr,knum) d2[kstr,knum],
d1$keystr, d1$keynum )
access the entries in this (indirect) fashion?
If you are not bounded to use mapply
you can do a join:
Code:
library(tidyverse)
d1 <- data.frame("keystr"=c("abc","def","def","abc"), "keynum"=c(5,2,7,3))
d2 <- data.frame("HD"=c("abc","def"),
"2"=c("H","L"), "3"=c("I","M"),
"5"=c("J","N"), "7"=c("K","P"))
d2 %>%
gather(keynum, value, -HD) %>%
mutate(keynum = as.numeric(gsub(keynum, pattern = "X", replacement = ""))) %>%
left_join(y = ., x = d1, by = c("keystr" = "HD", "keynum"))
Output:
keystr keynum value
1 abc 5 J
2 def 2 L
3 def 7 P
4 abc 3 I
We can transform the data frame and then conduct a merge by tidyr and dplyr .
library(dplyr)
library(tidyr)
d3 <- d2 %>%
gather(keynum, letter, -HD) %>%
mutate(keynum = as.numeric(sub("X", "", keynum)))
d4 <- d1 %>%
left_join(d3, by = c("keystr" = "HD", "keynum"))
d4
# keystr keynum letter
# 1 abc 5 J
# 2 def 2 L
# 3 def 7 P
# 4 abc 3 I
DATA
Notice that I set stringsAsFactors = FALSE
when creating the data frames.
d1 <- data.frame("keystr"=c("abc","def","def","abc"), "keynum"=c(5,2,7,3),
stringsAsFactors = FALSE)
d2 <- data.frame("HD"=c("abc","def"),
"2"=c("H","L"), "3"=c("I","M"),
"5"=c("J","N"), "7"=c("K","P"),
stringsAsFactors = FALSE)
You can use d1 columns to index the character values in d2[-1] if you convert to a matrix and the cbind the column character values. It creates a two-D lookup table to which you pass indices for both row and column at the same time. Then you can also pass a two-D matrix against it to generate a vector of outputs. (Can also use 3 or 4 or higher-D indexing with R arrays to which on=e would pass 3,4 or higher number column matrices):
( m2 <- sapply(d2[ , -1], as.character) )
#------
2 3 5 7
[1,] "H" "I" "J" "K"
[2,] "L" "M" "N" "P"
rownames(m2) <- as.character(d2[[1]])
m2
#--------
2 3 5 7
abc "H" "I" "J" "K"
def "L" "M" "N" "P"
(d1$val <- m2[ cbind(as.character(d1[[1]]),as.character(d1[[2]])) ])
[1] "J" "L" "P" "I"
d1
#--------
keystr keynum val
1 abc 5 J
2 def 2 L
3 def 7 P
4 abc 3 I
Note the need to use as.character
repeatedly, because those were factor columns. Better construction would have been to build your data.frames with stringsAsFactors=FALSE
. Building the matrix will be fast and the indexing is likely to be very efficient.
You can reshape and join the data.frames using base R:
d1 <- read.table(text = 'keystr keynum
1 abc 5
2 def 2
3 def 7
4 abc 3', stringsAsFactors = FALSE)
d2 <- read.table(text = 'HD 2 3 5 7
1 abc H I J K
2 def L M N P', stringsAsFactors = FALSE, check.names = FALSE)
d2 <- reshape(d2, idvar = "HD", varying = names(d2)[-1], v.names = "val",
times = names(d2)[-1], direction = "long")
merge(d1, d2, by.x = c("keystr", "keynum"), by.y = c("HD", "time"))
#> keystr keynum val
#> 1 abc 3 I
#> 2 abc 5 J
#> 3 def 2 L
#> 4 def 7 P
I think OP
was thinking right that mapply
can provide him a direct solution. He is pretty close to a working solution with his mapply
approach. Just logic to compare for the row selection has to be corrected and then paste0
to be used for column selection from d2
.
d1$val <- mapply(function(x,y)d2[d2$HD==x,paste0("X",y)],d1$keystr, d1$keynum)
d1
# keystr keynum val
# 1 abc 5 J
# 2 def 2 L
# 3 def 7 P
# 4 abc 3 I
#
Added a check.names = False to enable data.frame column names starting with numbers. Index with a cbind()
matrix of two columns, the i, j
pairs will be extracted all at once.
d1 <- data.frame("keystr"=c("abc","def","def","abc"), "keynum"=c(5,2,7,3))
d2 <- data.frame("HD"=c("abc","def"),
"2"=c("H","L"), "3"=c("I","M"),
"5"=c("J","N"), "7"=c("K","P"), check.names=FALSE)
d1$val <- mapply(function(kstr,knum) d2[cbind(match(kstr, d1$keystr),
match(knum, names(d2)))],
d1$keystr,
d1$keynum)
keystr keynum val
1 abc 5 J
2 def 2 L
3 def 7 P
4 abc 3 I
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.