Merging columns with overlapping data in R data frames

Question

a<-data.frame(cbind("Sample"=c("100","101","102","103"),"Status"=c("Y","","","partial")))
b<-data.frame(cbind("Sample"=c("100","101","102","103","106"),"Status"=c("NA","Y","","","Y")))

desired<-data.frame(cbind("Sample"=c("100","101","102","103","106"),"Status"=c("Y","Y","","partial","Y")))

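I have sample-processing data from several sources and want to combine it into one master list. How can I merge the "Status" column of the 2 data frames, overwriting b, so that "Y" and "partial" are collated for each sample? Thanks in advance.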

Answer 1

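I assume you want to keep the values from both a and b in order of priority: "Y" overrides "partial", which overrides "NA", which in turn overrides only an empty value.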

d <- merge(a, b, by = "Sample", all = TRUE)  # columns: Sample, Status.x, Status.y
d$Status <- ""
d$Status[apply(d, 1, function(x) any(is.na(x)))] <- ""  # cleaning the NAs I introduced with the merge
d$Status[apply(d, 1, `%in%`, x = "NA")] <- NA  # or "NA" if you want to keep it this way, or "" if you want to get rid of them
d$Status[apply(d, 1, `%in%`, x = "partial")] <- "partial"
d$Status[apply(d, 1, `%in%`, x = "Y")] <- "Y"
d <- d[, c(1, 4)]  # keep Sample and the new Status column

# Sample  Status
# 1    100       Y
# 2    101       Y
# 3    102        
# 4    103 partial
# 5    106       Y
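
The same priority rule can also be written as an explicit ranked vector, which replaces the repeated overwrite steps with one comparison. This is only a minimal sketch in base R, assuming the ranking "Y" > "partial" > "NA" > "" described above; the `priority` vector and the helper `pick_status` are hypothetical names introduced for illustration.

```r
a <- data.frame(Sample = c("100", "101", "102", "103"),
                Status = c("Y", "", "", "partial"),
                stringsAsFactors = FALSE)
b <- data.frame(Sample = c("100", "101", "102", "103", "106"),
                Status = c("NA", "Y", "", "", "Y"),
                stringsAsFactors = FALSE)

# Assumed ranking, highest priority first.
priority <- c("Y", "partial", "NA", "")

# Hypothetical helper: element-wise, return whichever value ranks higher.
# Values that are not listed in `priority` yield NA.
pick_status <- function(x, y) {
  x[is.na(x)] <- ""  # real NAs come from samples present in only one source
  y[is.na(y)] <- ""
  ifelse(match(x, priority) <= match(y, priority), x, y)
}

d <- merge(a, b, by = "Sample", all = TRUE)
d$Status <- pick_status(d$Status.x, d$Status.y)
d <- d[, c("Sample", "Status")]
print(d)
```

Keeping the ranking in one vector means a new status value only has to be added to `priority`, rather than as another overwrite line.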

Answer 2

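library(data.table)

a <- data.table(Sample = c("100","101","102","103"), Status = c("Y","","","partial"))
b <- data.table(Sample = c("100","101","102","103","106"), Status = c("NA","Y","","","Y"))

c <- merge(a, b, by = "Sample", all = TRUE)
# take Status.x unless it is missing or empty, otherwise fall back to Status.y
# (testing only !is.na(Status.x) would leave sample 101 empty instead of "Y")
c[, Status := ifelse(!is.na(Status.x) & Status.x != "", Status.x, Status.y)]
c[, `:=`(Status.x = NULL, Status.y = NULL)]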
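
Since the question mentions several sources, one possible generalization is to stack all the tables and keep the best status per sample. The following is a rough sketch in base R, not part of either answer; it assumes the ranking "Y" > "partial" > "NA" > "", and the names `sources`, `priority` and `best_status` are made up for this example.

```r
# Any number of source tables with the same two columns.
sources <- list(
  data.frame(Sample = c("100", "101", "102", "103"),
             Status = c("Y", "", "", "partial"),
             stringsAsFactors = FALSE),
  data.frame(Sample = c("100", "101", "102", "103", "106"),
             Status = c("NA", "Y", "", "", "Y"),
             stringsAsFactors = FALSE)
)

# Assumed ranking, highest priority first.
priority <- c("Y", "partial", "NA", "")

# Hypothetical helper: highest-ranked status among all values of one sample.
# Returns NA if any value is not listed in `priority`.
best_status <- function(s) priority[min(match(s, priority))]

stacked <- do.call(rbind, sources)                   # one long table
by_sample <- split(stacked$Status, stacked$Sample)   # statuses grouped by sample
master <- data.frame(
  Sample = names(by_sample),
  Status = vapply(by_sample, best_status, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(master)
```

Adding a third source then only means appending another data frame to `sources`.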

2 answers

Answer 1
1 2017-06-07 15:59:15

Answer 2
1 accepted 2017-06-07 16:00:03
