Customized merging of a data frame in R
I would like to merge the following data frame so that each row contains a data point and the name of the column it came from.
non.MML X2.MML X3.MML X4.MML X5.MML X6.7.MML
-13.994 NA NA NA NA NA
NA -13.992 NA NA NA NA
NA NA -13.984 NA NA NA
NA NA NA -13.983 NA NA
NA NA NA NA -13.962 NA
NA NA NA NA NA -13.907
NA NA -1.2 NA NA NA
NA NA NA -14.2 NA NA
NA NA NA NA -11.01 NA
NA NA NA NA NA -17.23
This is what I would like to get:
name score
non.MML -13.994
X2.MML -13.992
X3.MML -13.984
X4.MML -13.983
X5.MML -13.962
X6.7.MML -13.907
X3.MML -1.2
X4.MML -14.2
X5.MML -11.01
X6.7.MML -17.23
I tried this, and it gets me close to what I want:
mydata <- data.frame(x=unlist(mydata))
But I get this:
x
non.MML1 -13.994
X2.MML1 -13.992
X3.MML1 -13.984
X4.MML1 -13.983
X5.MML1 -13.962
X6.7.MML1 -13.907
X3.MML2 -1.2
X4.MML2 -14.2
X5.MML2 -11.01
X6.7.MML2 -17.23
As you can see, the first element of each row (the name) has a number appended to it because the column names repeat. What's the best way to get my desired output?
Use melt from reshape2:
reshape2::melt(df, na.rm = TRUE, variable.name = "name", value.name = "score")
# name score
#1 non.MML -13.994
#12 X2.MML -13.992
#23 X3.MML -13.984
#27 X3.MML -1.200
#34 X4.MML -13.983
#38 X4.MML -14.200
#45 X5.MML -13.962
#49 X5.MML -11.010
#56 X6.7.MML -13.907
#60 X6.7.MML -17.230
Or use the base R stack function:
setNames(na.omit(stack(df)), c("score", "name"))
# score name
#1 -13.994 non.MML
#12 -13.992 X2.MML
#23 -13.984 X3.MML
#27 -1.200 X3.MML
#34 -13.983 X4.MML
#38 -14.200 X4.MML
#45 -13.962 X5.MML
#49 -11.010 X5.MML
#56 -13.907 X6.7.MML
#60 -17.230 X6.7.MML
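Both results above come out grouped by column, whereas the desired output in the question follows the original row order. Below is a minimal base R sketch of one way to get the row order back. It rebuilds the example data with one value per row, which is an assumption about the layout of the real data, and the names `cols`, `m`, `by_column`, `idx` and `by_row` are introduced only for this illustration.

```r
# Rebuild the example: 10 rows, 6 columns, one non-NA value per row.
# (Values are copied from the question; the exact row layout is assumed.)
cols <- c("non.MML", "X2.MML", "X3.MML", "X4.MML", "X5.MML", "X6.7.MML")
m <- matrix(NA_real_, nrow = 10, ncol = 6, dimnames = list(NULL, cols))
m[cbind(1:10, c(1, 2, 3, 4, 5, 6, 3, 4, 5, 6))] <-
  c(-13.994, -13.992, -13.984, -13.983, -13.962, -13.907,
    -1.2, -14.2, -11.01, -17.23)
df <- as.data.frame(m)

# stack() approach from the answer: rows come out grouped by column.
long <- na.omit(stack(df))
by_column <- data.frame(name = as.character(long$ind),
                        score = long$values,
                        stringsAsFactors = FALSE)

# Alternative: locate the non-NA cells, then sort them by row index
# so the result follows the original row order.
idx <- which(!is.na(df), arr.ind = TRUE)
idx <- idx[order(idx[, "row"]), , drop = FALSE]
by_row <- data.frame(name = names(df)[idx[, "col"]],
                     score = df[idx],  # matrix indexing picks one cell per row of idx
                     stringsAsFactors = FALSE)

print(by_row)
```

If several values can share a row, `order(idx[, "row"])` keeps them together and, within a row, in column order, because `which()` returns the cells column by column and `order()` keeps tied entries in that order.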