
R: Sum columns based on match table, merging column names

I have a data frame like this (the letters are the column names):

a   b   c   B   C   A
1   2   3   6   7   8
1   2   3   6   7   8
1   2   3   6   7   8
1   2   3   6   7   8

I want to sum the columns pairwise according to this match table:

a   A
b   B
c   C

The column names should be merged at the same time, so the result would be:

a/A b/B c/C
9   8   10
9   8   10
9   8   10
9   8   10

Keep in mind that the solution has to work for a large data frame, so I cannot specify the new column names by hand.

Thanks a lot!

You can do it like this:

res <- apply(df.match, 1, function(x) rowSums(df[,c(x[1], x[2])]))
colnames(res) <- paste0(df.match[,1], "/", df.match[,2])

#     a/A    b/B   c/C
#[1,]    9    8   10
#[2,]    9    8   10
#[3,]    9    8   10
#[4,]    9    8   10
#[5,]    9    8   10

Here df is your data frame and df.match is the table of matching column names.
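For a self-contained run, here is a minimal sketch that first builds the example inputs. The construction of `df` and `df.match` is an assumption reconstructed from the tables in the question (the column names `V1` and `V2` are hypothetical); only the `apply` and `colnames` statements come from the answer.

```r
# Example inputs reconstructed from the question (assumed, not part of the answer)
df <- data.frame(a = rep(1, 4), b = rep(2, 4), c = rep(3, 4),
                 B = rep(6, 4), C = rep(7, 4), A = rep(8, 4))
df.match <- data.frame(V1 = c("a", "b", "c"), V2 = c("A", "B", "C"),
                       stringsAsFactors = FALSE)

# For each row of the match table, sum the two named columns of df row by row
res <- apply(df.match, 1, function(x) rowSums(df[, c(x[1], x[2])]))

# Build the merged column names from the match table itself
colnames(res) <- paste0(df.match[, 1], "/", df.match[, 2])

res
```

Note that `res` is a numeric matrix rather than a data frame, because `apply` binds the per-pair sums into a matrix.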

This basically comes down to using match twice. Using @Lyngbakr's data:

#DATA
df = structure(list(a = c(1, 1, 1, 1, 1), b = c(2, 2, 2, 2, 2), c = c(3, 
3, 3, 3, 3), A = c(8, 8, 8, 8, 8), C = c(7, 7, 7, 7, 7), B = c(6, 
6, 6, 6, 6)), .Names = c("a", "b", "c", "A", "C", "B"), row.names = c(NA, 
-5L), class = "data.frame")
df.names = structure(list(First = c("a", "b", "c"), Second = c("A", "B", 
"C")), .Names = c("First", "Second"), row.names = c(NA, -3L), class = "data.frame")

toadd = which(colnames(df) %in% df.names[,1])
addto = match(df.names[,2][match(colnames(df)[toadd], df.names[,1])], colnames(df))
setNames(object = df[,addto] + df[,toadd], nm = paste(colnames(df)[toadd], colnames(df)[addto], sep = "/"))
#  a/A b/B c/C
#1   9   8  10
#2   9   8  10
#3   9   8  10
#4   9   8  10
#5   9   8  10
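To make the two `match` calls easier to follow, the one-liner can be split into named steps. This is only a sketch of the same idea: the intermediate variable `rows` is introduced here for illustration, and the data is rebuilt with `data.frame()` instead of `structure()`.

```r
# Same example data as above, rebuilt with data.frame() for readability
df <- data.frame(a = rep(1, 5), b = rep(2, 5), c = rep(3, 5),
                 A = rep(8, 5), C = rep(7, 5), B = rep(6, 5))
df.names <- data.frame(First = c("a", "b", "c"), Second = c("A", "B", "C"),
                       stringsAsFactors = FALSE)

# Positions in df of the columns listed in the first column of the match table
toadd <- which(colnames(df) %in% df.names[, 1])        # 1 2 3

# Inner match: which row of the match table each of those columns belongs to
rows <- match(colnames(df)[toadd], df.names[, 1])      # 1 2 3

# Outer match: position in df of each partner column (A, B, C)
addto <- match(df.names[rows, 2], colnames(df))        # 4 6 5

# Add the two column sets element-wise and attach the merged names
res <- setNames(df[, addto] + df[, toadd],
                paste(colnames(df)[toadd], colnames(df)[addto], sep = "/"))
```

Because the partner positions are looked up by name, the result does not depend on the order in which the upper-case columns appear in `df`.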

Here is one way to do it...

df <- data.frame(a=c(1,1,1,1),b=c(2,2,2,2),c=c(3,3,3,3),B=c(6,6,6,6),C=c(7,7,7,7),A=c(8,8,8,8))
matchtab <- data.frame(V1=c("a","b","c"),V2=c("A","B","C"),stringsAsFactors = FALSE)

df2 <- do.call(cbind,lapply(seq_len(nrow(matchtab)),function(i) 
                           data.frame(df[,matchtab$V1[i]]+df[,matchtab$V2[i]])))
names(df2) <- paste0(matchtab$V1,"/",matchtab$V2)

df2
  a/A b/B c/C
1   9   8  10
2   9   8  10
3   9   8  10
4   9   8  10
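Since the question mentions large data frames, the per-pair loop could possibly be avoided by selecting both column sets at once and adding them as whole data frames. The following is a sketch of that variant, not part of the answer above; it assumes every name in `matchtab` exists in `df` and that all paired columns are numeric.

```r
df <- data.frame(a = c(1, 1, 1, 1), b = c(2, 2, 2, 2), c = c(3, 3, 3, 3),
                 B = c(6, 6, 6, 6), C = c(7, 7, 7, 7), A = c(8, 8, 8, 8))
matchtab <- data.frame(V1 = c("a", "b", "c"), V2 = c("A", "B", "C"),
                       stringsAsFactors = FALSE)

# Select the two column sets in match-table order and add them element-wise
df2 <- df[matchtab$V1] + df[matchtab$V2]

# Replace the inherited names with the merged names
names(df2) <- paste0(matchtab$V1, "/", matchtab$V2)

df2
```

Selecting by name keeps each pair aligned regardless of column order in `df`.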

Something like this?

df <- data.frame(a = rep(1, 5), b = rep(2, 5), c = rep(3, 5), 
                 A = rep(8, 5), C = rep(7, 5), B = rep(6, 5))

df.names <- data.frame(First = c("a", "b", "c"), Second =  c("A", "B", "C"))

# Use the mydf argument inside the function instead of the global df
apply(df.names, MARGIN = 1, FUN = function(mynames, mydf) rowSums(mydf[, colnames(mydf) %in% mynames]), mydf = df)

which gives,

     [,1] [,2] [,3]
[1,]    9    8   10
[2,]    9    8   10
[3,]    9    8   10
[4,]    9    8   10
[5,]    9    8   10
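The matrix above does not yet carry the merged column names the question asks for. One possible way to add them, borrowing the `paste` idea from the other answers, is sketched below; the variable name `res` is introduced here for illustration.

```r
df <- data.frame(a = rep(1, 5), b = rep(2, 5), c = rep(3, 5),
                 A = rep(8, 5), C = rep(7, 5), B = rep(6, 5))
df.names <- data.frame(First = c("a", "b", "c"), Second = c("A", "B", "C"),
                       stringsAsFactors = FALSE)

# One column of row sums per row of the match table
res <- apply(df.names, MARGIN = 1,
             FUN = function(mynames, mydf) rowSums(mydf[, colnames(mydf) %in% mynames]),
             mydf = df)

# Name each result column after the pair it was built from
colnames(res) <- paste(df.names$First, df.names$Second, sep = "/")

res
```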

