How to use melt() from the reshape2 package to stack categorical labels of data to produce multiple side-by-side boxplots
I am trying to use the melt() function from the "reshape2" package in R to stack a dataframe while keeping categorical labels for the individual observations. My question is: how do I adapt Eric Cai's code to produce multiple side-by-side notched boxplots at the level of behaviours$Family (a 2-level factor column), grouped by each behavioural variable, for the data set called behaviours (a link to the dummy data is supplied below)?
My aim is to colour-code these notched boxplots for each family (V4 = red and W3 = blue) with a legend. However, I am encountering an issue with dimensions when arranging the dataframe using the melt() function, and I am having trouble deciphering it. If anyone can help, many thanks in advance.
The reproducible dummy data can be found at the bottom of a Stack Overflow page (Reproducible data).
Here is an example:
I am trying to follow Eric Cai's instructions
(1) Stack the data:
(a) Retain the categorical (2 level factor column) for family [,1]
(b) Retain all behavioural variables [,2:13]
#Set vectors for labelling the data
behaviours.label=c("Swimming",
"Not.Swimming",
"Running",
"Not.Running",
"Fighting",
"Not.Fighting",
"Resting",
"Not.Resting",
"Hunting",
"Not.Hunting",
"Grooming",
"Not.Grooming")
family.labels=c("V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8")
library(tidyr)
data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
head(data_long)
# stack the data while retaining the Family and behavioural variables
stacked.data = melt(behaviours, id = c('Family', 'behaviours'))
# remove the column that gives the column name variable
stacked.data = stacked.data[, -3]
#head(stacked.data)
colnames(stacked.data)<-c("Family", "Behaviours", "Values")
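A minimal sketch of the stacking step, assuming the wide data has one Family column plus one numeric column per behaviour. The data frame `toy` and its values are hypothetical stand-ins, not the real behaviours data. The point it illustrates is that the id variables passed to melt() must be existing column names, so an id such as 'behaviours' (the name of the data frame rather than of a column) is a likely source of the problem above.

```r
library(reshape2)

# Hypothetical stand-in for the behaviours data: one factor column plus
# three numeric behaviour columns (the real data has twelve).
set.seed(1)
toy <- data.frame(
  Family       = factor(rep(c("V4", "G8"), each = 5)),
  Swimming     = rnorm(10),
  Not.Swimming = rnorm(10),
  Running      = rnorm(10)
)

# Only columns that exist in the wide data can be id variables.
# Every other column is stacked into a name/value pair.
stacked <- melt(toy,
                id.vars       = "Family",
                variable.name = "Behaviours",
                value.name    = "Values")

dim(stacked)    # 30 rows (10 observations x 3 behaviours), 3 columns
head(stacked)
```

Naming the output columns through variable.name and value.name avoids having to drop and rename columns afterwards.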
Generate an object called boxplots.double, which uses the formula Mean.value ~ Family + Behaviours to separate the plots into 12 groups of doublets (i.e. each behaviour is grouped at the level of behaviours$Family in a single plot). In Eric Cai's code, at = is an option that specifies the locations of the boxplots along the horizontal axis, and xaxt = 'n' suppresses the default horizontal axis so that a custom axis can be added with axis() and title().
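To illustrate how at = and xaxt = 'n' work together, here is a small sketch on made-up long-format data (`toy_long`, three behaviours and two families, all values random). It is an assumption-laden toy, not Eric Cai's original code: the box positions leave a gap after each pair, and one axis label is centred under each pair.

```r
# Toy long-format data: 2 families x 3 behaviours, 10 values per cell
set.seed(42)
toy_long <- expand.grid(rep        = 1:10,
                        Family     = c("G8", "V4"),
                        Behaviours = c("Swimming", "Running", "Resting"))
toy_long$Values <- rnorm(nrow(toy_long))

pdf(NULL)  # draw to a null device so the sketch runs without a display

# Leave a gap after every pair of boxes: positions 1,2  4,5  7,8
positions <- c(1, 2, 4, 5, 7, 8)

bp <- boxplot(Values ~ Family + Behaviours, data = toy_long,
              at = positions, xaxt = "n",
              col = c("red", "blue"),
              xlab = "Behaviours", ylab = "Values")

# One label per behaviour, centred under each pair of boxes
axis(side = 1, at = c(1.5, 4.5, 7.5),
     labels = c("Swimming", "Running", "Resting"))

dev.off()
```

Note that the at vector passed to axis() has the same length as labels, which is the requirement the error message further down refers to.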
boxplots.double = boxplot(values~Family + Behaviours,
data = stacked.data,
at = c(1:24),
xaxt='n',
ylim = c(min(0, min(-3)),
max(7, na.rm = T)),
notch=TRUE,
col = c("red", "blue"),
names = c("V4", "G8"),
cex.axis=1.0,
srt=45)
axis(side=1, at=c(1.8, 6.8), labels=c("Swimming",
"Not.Swimming",
"Running",
"Not.Running",
"Fighting",
"Not.Fighting",
"Resting",
"Not.Resting",
"Hunting",
"Not.Hunting",
"Grooming",
"Not.Grooming"), line=0.5, lwd=0)
Error in axis(side = 1, at = 1:24, labels = c("V4", "G8"), xaxt = "n", :
'at' and 'labels' lengths differ, 24 != 2
In addition: Warning message:
In bxp(list(stats = c(-1.20186549488911, -0.970033304559564, -0.465271399251147, :
some notches went outside hinges ('box'): maybe set notch=FALSE
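The error above reports 24 box positions against 2 labels, which points at names = c("V4", "G8") in the boxplot() call. A hedged sketch of one way to avoid it, using hypothetical toy data with 6 boxes instead of 24: build a names vector with one entry per box, or simply omit names when the default axis is suppressed anyway.

```r
# Toy long-format data (hypothetical): 2 families x 3 behaviours
set.seed(7)
toy_long <- expand.grid(rep        = 1:10,
                        Family     = c("G8", "V4"),
                        Behaviours = c("Swimming", "Running", "Resting"))
toy_long$Values <- rnorm(nrow(toy_long))

n_boxes <- nlevels(toy_long$Family) * nlevels(toy_long$Behaviours)

# 'names' needs one entry per box. Family varies fastest in the formula
# Family + Behaviours, so repeat the family levels once per behaviour.
box_names <- rep(levels(toy_long$Family),
                 times = nlevels(toy_long$Behaviours))
stopifnot(length(box_names) == n_boxes)

pdf(NULL)  # null device, so no window is opened
bp <- boxplot(Values ~ Family + Behaviours, data = toy_long,
              at    = seq_len(n_boxes),
              names = box_names,
              notch = FALSE,   # small groups can push notches past the hinges
              col   = c("red", "blue"))
dev.off()
```

The notch warning is separate from the error: it only says that the notches are wider than the boxes for some groups, as the message itself suggests.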
After Richard Telford kindly offered to help, this code produces multiple side-by-side boxplots grouped at the level of the categorical column (2 levels) called Family, using the melt() function from the reshape2 package.
# Clear the workspace
rm(list=ls())
data(behaviours)
#Set vectors for labelling the data
behaviours.labels=c("Swimming",
"Not.Swimming",
"Running",
"Not.Running",
"Fighting",
"Not.Fighting",
"Resting",
"Not.Resting",
"Hunting",
"Not.Hunting",
"Grooming",
"Not.Grooming")
family.labels=c("V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8",
"V4", "G8")
library(tidyr)
#Structure the data from wide to long format
data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
head(data_long)
library(reshape2)
# stack the data while retaining Family and Values calculated from behaviours[,2:13] using the melt() function
stacked.data = melt(data_long, id = c('Family', 'x'))
head(stacked.data)
# remove the column named `variable` (it only holds the name of the melted column) from stacked.data
stacked.data = stacked.data[, -3]
head(stacked.data)
#Rename the column headings
colnames(stacked.data)<-c("Family", "Behaviours", "Values")
#Generate the side-by-side boxplots
windows(height=10, width=14)
par(mar = c(9, 7, 4, 4)+0.3, mgp=c(5, 1.5, 0))
boxplots.double = boxplot(Values~Family + Behaviours,
data = stacked.data,
at = c(1:24),
ylim = c(0, 1.8),
xaxt = "n",
notch=TRUE,
col = c("red", "blue"),
cex.axis=0.7,
cex.labels=0.7,
ylab="Values",
xlab="Behaviours",
space=1)
axis(side = 1, at = seq(2, 24, by = 2), labels = FALSE)
text(seq(2, 24, by=2), par("usr")[3] - 0.2, labels=unique(behaviours.labels), srt = 45, pos = 1, xpd = TRUE, cex=0.8)
legend("topright", title = "Family", cex=1.0, legend=c("V4" , "G8"), fill=c("Blue", "Red"), lty = c(1,1))
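One detail worth checking in the plot above is which family gets which colour. Assuming Family is a factor with default level ordering, its levels are sorted alphabetically ("G8" before "V4"), so col = c("red", "blue") is applied in that order. The sketch below uses hypothetical toy data and a named colour vector (`family_cols`, an illustrative name) so that boxes and legend cannot drift apart.

```r
# Hypothetical toy data with the two family labels used in the code above
set.seed(3)
toy_long <- data.frame(
  Family = factor(rep(c("V4", "G8"), times = 10)),
  Values = rnorm(20)
)

levels(toy_long$Family)   # "G8" "V4": sorted, not in order of appearance

# Name the colours by family, then reorder them to match the factor levels
family_cols <- c(V4 = "red", G8 = "blue")
box_cols    <- unname(family_cols[levels(toy_long$Family)])

pdf(NULL)  # null device, so no window is opened
boxplot(Values ~ Family, data = toy_long, col = box_cols)
legend("topright", title = "Family",
       legend = levels(toy_long$Family), fill = box_cols)
dev.off()
```

Driving both col and the legend from the same level-ordered vector keeps the colour key correct even if the factor levels are later reordered.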