Partially Merging Data Sets in R
I have two data files that look like this:
bin chrom chromStart chromEnd name score strand
23 chr1 119537649 119537708 A_14_P109202 1000 +
109 chr1 37879762 37879821 A_16_P15088121 1000 +
129 chr1 59113425 59113484 A_16_P00074945 1000 +
138 chr1 68288459 68288517 A_16_P00088142 1000 +
and
Hybridization REF TCGA-02-0001-01C-01D-0185-02
Composite Element REF normalizedLog2Ratio
A_14_P112718 0.034472223
A_16_P15000916 -0.038733669
A_16_P15001074 -0.498562753
A_16_P00000012 -0.269915751
Using the names from the first column of the second file, I need to extract additional data from the data table in the first file. However, not every name in the second file appears in the first. I am having problems getting the files to merge properly. Any help is much appreciated.
If you set all.x = TRUE in the merge() call, all of the records from the first data frame will be in the merged data frame, even if they don't have a match in the second. Is that the problem you were encountering? In the example that you gave, none of the row names matched any of the values in the name variable.
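One way to check this before merging is to compare the two sets of names directly. This is a minimal sketch using only the names shown in the question; the variable names first_names, second_names, shared and unmatched are made up for illustration.

```r
# Names from the "name" column of the first file (as shown in the question)
first_names <- c("A_14_P109202", "A_16_P15088121",
                 "A_16_P00074945", "A_16_P00088142")
# Names from the first column of the second file (as shown in the question)
second_names <- c("A_14_P112718", "A_16_P15000916",
                  "A_16_P15001074", "A_16_P00000012")

# Names present in both files
shared <- intersect(first_names, second_names)
length(shared)

# Names in the second file that have no match in the first
unmatched <- setdiff(second_names, first_names)
unmatched
```

With the sample rows from the question, shared is empty and unmatched contains all four names from the second file.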
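# rebuild the sample rows from the first file
bin <- c(23, 109, 129, 138)
chrom <- c("chr1", "chr1", "chr1", "chr1")
chromStart <- c(119537649, 37879762, 59113425, 68288459)
name <- c("A_14_P109202", "A_16_P15088121", "A_16_P00074945", "A_16_P00088142")
# pass the vectors to data.frame() directly; wrapping them in cbind() would coerce every column to character
b <- data.frame(bin, chrom, chromStart, name, stringsAsFactors = FALSE)
# rebuild the sample rows from the second file, with the probe names as row names
y <- data.frame(normalizedLog2Ratio = c(0.034472223, -0.038733669, -0.498562753, -0.269915751))
rownames(y) <- c("A_14_P112718", "A_16_P15000916", "A_16_P15001074", "A_16_P00000012")
print(b)
print(y)
# check the rows
nrow(b)
nrow(y)
# write rownames to new variable
y$name <- rownames(y)
# conduct merge, keeping every row of b even when y has no match
newdataframe <- merge(b, y, by = "name", all.x = TRUE)
# check number of rows
nrow(newdataframe)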
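Because the sample rows share no names, the merge above can only show unmatched rows. The sketch below uses small made-up data frames to show how the default merge, all.x = TRUE and all.y = TRUE differ when some names overlap. The probe names P1 to P4, the values, and the object names annot, ratios, inner, left and right are all hypothetical. Since the question starts from the names in the second file, the default merge or all.y = TRUE may be closer to what is wanted, but that is an assumption about the goal.

```r
# Hypothetical annotation table (stands in for the first file)
annot <- data.frame(name = c("P1", "P2", "P3"),
                    chromStart = c(100, 200, 300),
                    stringsAsFactors = FALSE)
# Hypothetical measurement table (stands in for the second file)
ratios <- data.frame(name = c("P2", "P3", "P4"),
                     normalizedLog2Ratio = c(0.1, -0.2, 0.3),
                     stringsAsFactors = FALSE)

# Default: keep only names found in both tables
inner <- merge(annot, ratios, by = "name")

# all.x = TRUE: keep every row of annot; missing ratios become NA
left <- merge(annot, ratios, by = "name", all.x = TRUE)

# all.y = TRUE: keep every row of ratios; missing annotation becomes NA
right <- merge(annot, ratios, by = "name", all.y = TRUE)

nrow(inner)  # 2 rows: P2 and P3
nrow(left)   # 3 rows: P1, P2, P3
nrow(right)  # 3 rows: P2, P3, P4
```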