
dcast and reorder data of a dataframe

I have the following dataframe:

MOUSE_GENOTYPE  GENE_ID  INFLAMMATION_TYPE  Fold gene expression
WT              Arg1     microglia          0.02581
WT              Arg1     microglia          0.06783
KO              Arg1     microglia          0.01477
KO              Arg1     microglia          0.01787
WT              Aspg     PAN-reactive       0.01856
WT              Aspg     PAN-reactive       0.08373
KO              Aspg     PAN-reactive       0.0199
KO              Aspg     PAN-reactive       0.09839
WT              Emp1     A2-specific        0.03525
WT              Emp1     A2-specific        0.02738
KO              Emp1     A2-specific        0.01627
KO              Emp1     A2-specific        0.02832

I'd like to reshape it, aggregating by the mean, into something like the following:

                Arg1        Aspg                Emp1
MOUSE_GENOTYPE  microglia   PAN-reactive        A2-specific
WT              0.02581      0.0185             0.00691
KO              0.01477      0.0199             0.00631

I'd also like to re-order the columns by the variable INFLAMMATION_TYPE, so that the PAN-reactive columns come first, then microglia, and then A2-specific:

                Aspg         Arg1                Emp1
MOUSE_GENOTYPE  PAN-reactive microglia          A2-specific
WT               0.0185      0.02581            0.00691
KO               0.0199      0.0147             0.00631

I have tried the dcast function:

results_reshaped <- dcast(results, 
                          MOUSE_GENOTYPE  ~ GENE_ID + INFLAMMATION_TYPE,
                          fun.aggregate = mean)

but GENE_ID and INFLAMMATION_TYPE get combined into a single variable:

MOUSE_GENOTYPE  Arg1_ microglia  Aspg_ PAN-reactive  Emp1_ A2-specific
WT              0.02581          0.01856             0.00691
KO              0.01477          0.0199              0.00631

I would like to keep them in separate header rows and re-order the columns as shown above. Suggestions?

Thanks! P

You can reformat the column names of your dataframe as follows to get a structure similar to the one you want:

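# Split each column name into its gene part and its inflammation-type part
vecoriginal <- names(results_reshaped)
vecnames <- vecoriginal[-1]
v1 <- c("", gsub("_.*", "", vecnames))              # text before the first underscore (gene)
v2 <- c(vecoriginal[1], gsub(".*_", "", vecnames))  # text after the last underscore (type)
# Use the gene names as the column names
names(results_reshaped) <- v1
# Build a two-row dataframe holding both header levels
empty <- as.data.frame(t(data.frame(v1, v2, stringsAsFactors = FALSE)), stringsAsFactors = FALSE)
names(empty) <- names(results_reshaped)
# Stack the header rows on top of the data (this coerces the values to character),
# then drop the first row, which only repeats the gene names
DF <- rbind(empty, results_reshaped)
DF <- DF[-1, ]
rownames(DF) <- NULL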
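
The code above splits the header but does not cover the re-ordering by INFLAMMATION_TYPE that the question asks for. One possible way to do that, as a minimal base-R sketch, is to rank the combined column names by a desired type order before renaming them. The names `type_order`, `value_cols`, `types`, `ord` and `results_ordered` are made up for this illustration. The small dataframe is a hypothetical stand-in for the dcast output, and the sketch assumes gene names contain no underscore.

```r
# Hypothetical stand-in for the dcast() output shown in the question
results_reshaped <- data.frame(
  MOUSE_GENOTYPE = c("WT", "KO"),
  `Arg1_microglia` = c(0.02581, 0.01477),
  `Aspg_PAN-reactive` = c(0.01856, 0.0199),
  `Emp1_A2-specific` = c(0.00691, 0.00631),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Desired order of inflammation types, as stated in the question
type_order <- c("PAN-reactive", "microglia", "A2-specific")

# Assumption: the inflammation type is everything after the first underscore
value_cols <- names(results_reshaped)[-1]
types <- sub("^[^_]*_", "", value_cols)

# Rank each value column by the position of its type in type_order
ord <- order(match(types, type_order))

# Keep the genotype column first, followed by the re-ordered value columns
results_ordered <- results_reshaped[, c("MOUSE_GENOTYPE", value_cols[ord])]
names(results_ordered)
```

If this runs before the renaming step, the same header-splitting code can then be applied to `results_ordered` instead of `results_reshaped`.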
