R: Comparing one data frame against two other data frames of different lengths

Question

I have 3 data frames of unknown length.

Data frame A looks like this:

    A1  A2   n
1    1   2   1
2    3   2   2
3    2   4   3

Similarly, data frame B looks like this:

    B1  B2   n
1    3   4   1
2    4   1   2
3    1   3   3

Note that within each row the values of A1, A2, B1 and B2 are all different and consist of the numbers 1 to 4.

Finally, I have data frame C:

Note that the values of C1 all lie between 0 and 4.

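The n column links all the data frames. What I want to do is check, for each n, whether the value of C1 occurs in data frame A or in data frame B, and replace the value in C1 directly with the name of that data frame. If the value is 0, it should stay 0. This is the result I expect: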

How can I do this? Thanks for your input.

Answer 1

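Here is an idea. We first merge the first two data frames. After merging, we can create a new data frame by stacking all columns except n. With this new data frame (df5 in our case), we can match the pasted n-value pairs from df5 against the pasted n-C1 pairs from your third data frame (df4 in our case). A simple gsub operation then keeps only the letter of each matched column name. Finally, we set the NAs to 0.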

# df2, df3 and df4 play the roles of data frames A, B and C from the question
df_all <- merge(df2, df3, by = 'n')
#  n A1 A2 B1 B2
#1 1  1  2  3  4
#2 2  3  2  4  1
#3 3  2  4  1  3

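# Stack every column except n; n is taken from df_all (recycled once per stacked column)
# rather than assumed to be 1:nrow(df_all)
df5 <- data.frame(n = df_all$n, stack(df_all[-1]), stringsAsFactors = FALSE)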
#head(df5)
#  n values ind
#1 1      1  A1
#2 2      3  A1
#3 3      2  A1
#4 1      2  A2
#5 2      2  A2
#6 3      4  A2
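# Match the "n C1" keys of df4 against the "n value" keys of df5,
# then strip the digits from the matched column names (A1 -> A, B2 -> B)
ind <- gsub('\\d+', '', df5$ind)[match(do.call(paste, df4), do.call(paste, df5[-3]))]
# Rows with no match (including C1 == 0) become "0"
ind[is.na(ind)] <- "0"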
ind
#[1] "B" "A" "B" "0" "A" "A" "B" "0" "B"
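
For readers who want to run the idea above end to end, here is a minimal self-contained sketch. It assumes the data frames A, B and C given in the Data section of the second answer; the names wide, long, key_C, key_long, hit, label and result are illustrative only and do not appear in the original answer.

```r
# Toy data from the question (A, B, C play the roles of df2, df3, df4)
A <- data.frame(A1 = c(1L, 3L, 2L), A2 = c(2L, 2L, 4L), n = 1:3)
B <- data.frame(B1 = c(3L, 4L, 1L), B2 = c(4L, 1L, 3L), n = 1:3)
C <- data.frame(n  = rep(1:3, each = 3),
                C1 = c(3L, 1L, 4L, 0L, 2L, 3L, 3L, 0L, 1L))

# Wide table: one row per n, with columns A1, A2, B1, B2
wide <- merge(A, B, by = "n")

# Long lookup table: one row per (n, value) pair plus the source column name
long <- data.frame(n = wide$n, stack(wide[-1]))

# Build "n value" keys on both sides and look up the first match
key_C    <- paste(C$n, C$C1)
key_long <- paste(long$n, long$values)
hit      <- match(key_C, key_long)

# Keep only the letter of the matched column; unmatched rows become "0"
label <- sub("\\d+$", "", as.character(long$ind))[hit]
label[is.na(label)] <- "0"

# Same shape as C, with C1 replaced by the data frame letter
result <- data.frame(n = C$n, C1 = label)
print(result)
```

Because the lookup goes through match, the row order of C is preserved, and a value that occurred in both A and B for the same n would be labelled with whichever column comes first in the stacked table.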

Answer 2

A slightly different approach is to first left outer join both A and B onto C, and then find which of the columns added by the joins equals C1:

## Do the left outer joins with merge by n and all.x = TRUE
out <- merge(merge(C, A, by = "n", all.x = TRUE), B, by = "n", all.x = TRUE)
## For row i, return the first letter of the first joined column whose value
## equals C1, or "0" if no column matches
extract.name <- function(i, out) {
  j <- which(out$C1[i] == out[i, 3:ncol(out)])
  if (length(j) == 0) return("0")
  substr(colnames(out)[j[1] + 2], 1, 1)
}
## Then, apply it to all rows
out$C1 <- sapply(seq_len(nrow(out)), extract.name, out)
## Keep only the n and C1 columns as output
out <- out[, 1:2]
##  n C1
##1 1  B
##2 1  A
##3 1  B
##4 2  0
##5 2  A
##6 2  A
##7 3  B
##8 3  0
##9 3  B
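
The row-by-row function can also be replaced by a single matrix comparison. The following is only a sketch of that variant, not part of the original answer: it keeps the same two left joins and uses the same toy data, while the names cand, hits, first and label are made up for illustration.

```r
# Toy data from the question
A <- data.frame(A1 = c(1L, 3L, 2L), A2 = c(2L, 2L, 4L), n = 1:3)
B <- data.frame(B1 = c(3L, 4L, 1L), B2 = c(4L, 1L, 3L), n = 1:3)
C <- data.frame(n  = rep(1:3, each = 3),
                C1 = c(3L, 1L, 4L, 0L, 2L, 3L, 3L, 0L, 1L))

# Same left outer joins as above: columns n, C1, A1, A2, B1, B2
out <- merge(merge(C, A, by = "n", all.x = TRUE), B, by = "n", all.x = TRUE)

# Candidate columns as a matrix; comparing with out$C1 recycles the vector
# down each column, so entry [i, j] is compared with C1[i]
cand <- as.matrix(out[, -(1:2)])
hits <- cand == out$C1
hits[is.na(hits)] <- FALSE          # rows of C whose n has no partner in A or B

# Index of the first matching column per row (arbitrary when a row has no hit)
first <- max.col(hits * 1, ties.method = "first")

# First letter of the matching column name, or "0" when nothing matched
label <- ifelse(rowSums(hits) > 0, substr(colnames(cand)[first], 1, 1), "0")

print(data.frame(n = out$n, C1 = label))
```

This avoids calling a function once per row, which may matter for large C. The trade-off is that the whole comparison matrix is held in memory at once.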

Data:

A <- structure(list(A1 = c(1L, 3L, 2L), A2 = c(2L, 2L, 4L), n = 1:3), .Names = c("A1", 
"A2", "n"), class = "data.frame", row.names = c("1", "2", "3"
))
##  A1 A2 n
##1  1  2 1
##2  3  2 2
##3  2  4 3

B <- structure(list(B1 = c(3L, 4L, 1L), B2 = c(4L, 1L, 3L), n = 1:3), .Names = c("B1", 
"B2", "n"), class = "data.frame", row.names = c("1", "2", "3"
))
##  B1 B2 n
##1  3  4 1
##2  4  1 2
##3  1  3 3

C <- structure(list(n = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), C1 = c(3L, 
1L, 4L, 0L, 2L, 3L, 3L, 0L, 1L)), .Names = c("n", "C1"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9"))
##  n C1
##1 1  3
##2 1  1
##3 1  4
##4 2  0
##5 2  2
##6 2  3
##7 3  3
##8 3  0
##9 3  1


Answer 1
2 accepted 2016-12-06 14:11:33

Answer 2
2 2016-12-06 15:01:21
