
How to use melt() from the reshape2 package to stack categorical labels of data to produce multiple side-by-side boxplots

I am trying to use the melt() function from the reshape2 package in R to stack a data frame while keeping the categorical labels for the individual observations. My question is: how do I adapt Eric Cai's code to produce multiple side-by-side notched boxplots, split by behaviours$Family (a two-level factor column) and grouped by each behavioural variable, for the data set called behaviours (a link to the dummy data is supplied below)?

My aim is to colour-code these notched boxplots by family (V4 = red and W3 = blue) and add a legend. However, I run into a problem with dimensions when arranging the data frame with the melt() function, and I am having trouble deciphering it. Many thanks in advance to anyone who can help.

The reproducible dummy data can be found at the bottom of a Stack Overflow page: Reproducible data

 Here is an example:

 I am trying to follow Eric Cai's instructions
 (1) Stack the data:
     (a) Retain the categorical (2 level factor column) for family [,1]
     (b) Retain all behavioural variables [,2:13]

  #Set vectors for labelling the data

                      behaviours.label=c("Swimming", 
                                         "Not.Swimming",
                                         "Running", 
                                         "Not.Running",
                                         "Fighting",
                                         "Not.Fighting",
                                         "Resting",
                                         "Not.Resting",
                                         "Hunting",
                                         "Not.Hunting",
                                         "Grooming",
                                         "Not.Grooming")

                         family.labels=c("V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8",
                                         "V4", "G8")

    library(tidyr)                        
    data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
    head(data_long)  

    # stack the data while retaining the Family and behavioural variables 

    stacked.data = melt(behaviours, id = c('Family', 'behaviours'))

    # remove the column that gives the column name variable
    stacked.data = stacked.data[, -3]

    #head(stacked.data)
    colnames(stacked.data)<-c("Family", "Behaviours", "Values")

Generating the Box Plots

Generate an object called boxplots.double, which uses the formula Values ~ Family + Behaviours to separate the plots into 12 pairs of boxes (i.e. each behaviour is split by behaviours$Family within a single plot). In Eric Cai's code, "at =" is an option that specifies the locations of the box plots along the horizontal axis, and xaxt = 'n' suppresses the default horizontal axis so that a custom axis can be added with axis() and title().
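To make the roles of "at =" and xaxt = 'n' concrete, here is a minimal sketch on toy data. The data frame `toy` and the names `n.boxes` and `label.at` are made up for illustration and are not part of the original data set; the point is only that `at` needs one position per box, while the custom axis needs one label per behaviour.

```r
# Toy data (hypothetical, not the asker's data set): 2 families x 3 behaviours
set.seed(1)
toy <- data.frame(
  Family     = factor(rep(c("G8", "V4"), times = 30)),
  Behaviours = factor(rep(c("Resting", "Running", "Swimming"), each = 20)),
  Values     = runif(60)
)

# Family varies fastest, so the boxes come in pairs, one pair per behaviour
n.boxes <- nlevels(toy$Family) * nlevels(toy$Behaviours)
bp <- boxplot(Values ~ Family + Behaviours, data = toy,
              at = 1:n.boxes,          # one position per box
              xaxt = "n",              # suppress the default x axis
              col = c("red", "blue"))  # colours are recycled across the pairs

# One label per behaviour, centred under each pair of boxes
label.at <- seq(1.5, n.boxes - 0.5, by = 2)
axis(side = 1, at = label.at, labels = levels(toy$Behaviours))
```

The labels are taken from the factor levels rather than from a hand-typed vector, so they stay in the same order in which boxplot() draws the groups.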

   boxplots.double = boxplot(values~Family + Behaviours, 
                             data = stacked.data, 
                             at = c(1:24), 
                             xaxt='n',
                             ylim = c(min(0, min(-3)), 
                             max(7, na.rm = T)),
                             notch=TRUE,
                             col = c("red", "blue"),
                             names = c("V4", "G8"),
                             cex.axis=1.0,
                             srt=45)

  axis(side=1, at=c(1.8, 6.8), labels=c("Swimming", 
                                       "Not.Swimming",
                                       "Running", 
                                       "Not.Running",
                                       "Fighting",
                                       "Not.Fighting",
                                       "Resting",
                                       "Not.Resting",
                                       "Hunting",
                                       "Not.Hunting",
                                       "Grooming",
                                       "Not.Grooming"), line=0.5, lwd=0)

Error message

   Error in axis(side = 1, at = 1:24, labels = c("V4", "G8"), xaxt = "n",     : 
  'at' and 'labels' lengths differ, 24 != 2
  In addition: Warning message:
  In bxp(list(stats = c(-1.20186549488911, -0.970033304559564,   -0.465271399251147,  :
  some notches went outside hinges ('box'): maybe set notch=FALSE
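The error appears to come from the names = c("V4", "G8") argument: boxplot() hands `names` to axis() as the labels, so it probably needs one entry per box (24 here), not one per family. A small sketch of building vectors of matching length, where `n.behaviours`, `box.at` and `box.names` are hypothetical names used only for illustration:

```r
# Assumed layout: 12 behaviours x 2 families = 24 boxes
n.behaviours <- 12
box.at    <- 1:(2 * n.behaviours)                      # 24 positions
box.names <- rep(c("V4", "G8"), times = n.behaviours)  # 24 labels

# 'at' and 'names' must have the same length before calling boxplot()
length(box.at) == length(box.names)
```

The same rule applies to the later axis() call: its `at` and `labels` arguments also need equal lengths.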

After Richard Telford kindly offered to help, the code below produces multiple side-by-side boxplots, grouped by the two-level categorical column Family, using the melt() function from the reshape2 package.

   # Clear the workspace
   rm(list=ls())

   data(behaviours)

   #Set vectors for labelling the data

   behaviours.labels=c("Swimming",  
                       "Not.Swimming",
                       "Running", 
                       "Not.Running",
                       "Fighting",
                       "Not.Fighting",
                       "Resting",
                       "Not.Resting",
                       "Hunting",
                       "Not.Hunting",
                       "Grooming",
                       "Not.Grooming")

       family.labels=c("V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8",
                       "V4", "G8")

      library(tidyr)

      #Structure the data from wide to long format 

      data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
      head(data_long)    

   library(reshape2)

   # stack the data while retaining Family and Values calculated from behaviours[,2:13] using the melt() function

   stacked.data = melt(data_long, id = c('Family', 'x'))
   head(stacked.data)

   # remove the 'variable' column, which only holds the name of the melted column

   stacked.data = stacked.data[, -3]
   head(stacked.data)

   #Rename the column headings

   colnames(stacked.data)<-c("Family", "Behaviours", "Values")    

   #Generate the side-by-side boxplots

   windows(height=10, width=14)
   par(mar = c(9, 7, 4, 4)+0.3, mgp=c(5, 1.5, 0))

   boxplots.double = boxplot(Values ~ Family + Behaviours,
                             data = stacked.data,
                             at = 1:24,
                             ylim = c(0, 1.8),
                             xaxt = "n",
                             notch = TRUE,
                             col = c("red", "blue"),
                             cex.axis = 0.7,
                             ylab = "Values",
                             xlab = "Behaviours")

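   # Tick marks centred under each pair of boxes
   axis(side = 1, at = seq(1.5, 23.5, by = 2), labels = FALSE)
   # boxplot() orders the groups by factor level, so take the labels from the levels
   text(seq(1.5, 23.5, by = 2), par("usr")[3] - 0.2, labels = levels(factor(stacked.data$Behaviours)), srt = 45, pos = 1, xpd = TRUE, cex = 0.8)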
   legend("topright", title = "Family", cex=1.0, legend=c("V4" , "G8"), fill=c("Blue", "Red"), lty = c(1,1))
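As a possible simplification, melt() can stack the wide data in one step, with Family as the only id variable, so the gather() call is not strictly needed. The sketch below uses a made-up stand-in called `wide` with only three behaviour columns; the column names passed to `variable.name` and `value.name` are chosen to match the headings used above.

```r
library(reshape2)

# Hypothetical stand-in for the wide `behaviours` data: one factor column
# plus one numeric column per behaviour (only three behaviours shown)
set.seed(42)
wide <- data.frame(
  Family       = factor(rep(c("G8", "V4"), each = 10)),
  Swimming     = runif(20),
  Not.Swimming = runif(20),
  Running      = runif(20)
)

# Stack every behaviour column while keeping Family as the identifier
stacked <- melt(wide, id.vars = "Family",
                variable.name = "Behaviours", value.name = "Values")

# melt() returns Behaviours as a factor whose levels follow the column order
levels(stacked$Behaviours)
dim(stacked)
```

Because the levels follow the original column order, boxes drawn from `stacked` would appear in that order too, and levels(stacked$Behaviours) can be used directly for the axis labels.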
