
boxplot in R, aesthetics must be either length 1 or the same length as data

I'm doing some analysis on the Auto MPG (miles per gallon) data set from the UCI website:

https://archive.ics.uci.edu/ml/datasets/Auto+MPG

I converted the first column into a high/low mileage indicator:

mpg01 = I(auto1$mpg >= median(auto1$mpg))
Auto = data.frame(mpg01, auto1[,-1])
head(Auto)

    mpg01 cylinders displacement horsepower weight acceleration year origin
 1 FALSE         8          307        130   3504         12.0   70      1
 2 FALSE         8          350        165   3693         11.5   70      1
 3 FALSE         8          318        150   3436         11.0   70      1
 4 FALSE         8          304        150   3433         12.0   70      1
 5 FALSE         8          302        140   3449         10.5   70      1
 6 FALSE         8          429        198   4341         10.0   70      1

Now I want to make a boxplot for each column of the data frame, grouped by the first column.

vars <- c("cylinders", "displacement", "horsepower", "weight", "acceleration", "year", "origin")
ggplot(Auto) + geom_bar(aes(y=vars, fill=factor(mpg01)))

And I get the error "Aes must be either length 1 or the same as data".

The dimensions of the "Auto" data frame are 392x8.

I could draw a separate boxplot for each column, but I want to know if there's a way to combine them into one plot. Thanks!

Updated to explain the error: aes(x, y, ...) describes how the variables of the data frame are mapped onto the geoms, so each aesthetic must be either a single value or as long as the data. In your case, vars is a character vector of 7 column names rather than a column of the 392-row data frame, and no x variable has been defined for geom_boxplot. To make each column of your df a value of x, the df needs to be reshaped to long format (e.g. using reshape2::melt or tidyr::gather).
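The length mismatch described above can be reproduced on a small scale. This is only a sketch: toy is a hypothetical 6-row stand-in for Auto with made-up values, and the exact wording of the error depends on the installed ggplot2 version, so the message is captured rather than assumed.

```r
library(ggplot2)

# Hypothetical 6-row stand-in for the Auto data frame (made-up values).
toy <- data.frame(
  mpg01        = c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE),
  cylinders    = c(8, 8, 4, 4, 6, 4),
  displacement = c(307, 350, 97, 113, 198, 88)
)
vars <- c("cylinders", "displacement")

# aes(y = vars) maps the character vector itself, not the columns it names,
# so the aesthetic has 2 elements while the data has 6 rows.
n_aes  <- length(vars)
n_data <- nrow(toy)

# Building a ggplot object is lazy; the length check runs when the plot is built.
p <- ggplot(toy) + geom_boxplot(aes(y = vars, fill = factor(mpg01)))
result <- tryCatch(
  {
    ggplot_build(p)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(result)
```

Because 2 is neither 1 nor 6, the build step fails, which is the same situation as 7 names against 392 rows in the question.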

Below is a solution that should work; it is based on mtcars rather than your data. If it does not, we can troubleshoot once you post the output of dput(Auto). The plot you generate should look like the one I attached. First, reshape your data.

library(reshape2)
library(ggplot2)
mtcars_melt <- melt(mtcars)

I can now define x in aes. Note the difference between the two facet_wrap cases below.

# First with no facet_wrap
ggplot(mtcars_melt, aes(x=variable, y=value, fill=variable)) + geom_boxplot()
# Case 1 with facet_wrap
ggplot(mtcars_melt, aes(x=variable, y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable)
# Case 2 with facet_wrap
ggplot(mtcars_melt, aes(x="", y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable)

In case 1, I define x=variable in aes, but facet_wrap then forces every facet to show all of the x categories. If I set x="" instead, each facet holds only one x category.

Now, to give each facet an independent y-axis scale, I can set scales="free_y":

ggplot(mtcars_melt, aes(x="", y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable, scales="free_y")

Alternatively, I can set scales="free" so that both the x and y axes are free, and use it with x=variable to arrive at a similar result.

ggplot(mtcars_melt, aes(x=variable, y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable, scales="free")


Edited: The code below should work for your particular data set:

library(reshape2)
library(ggplot2)
vars <- c("cylinders", "displacement", "horsepower", "weight", "acceleration", "year", "origin")
Auto_melt <- melt(Auto[, vars])
ggplot(Auto_melt, aes(x="", y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable, scales="free_y")

Edited with code to separate by mpg as requested: redefine vars to include "mpg01", melt the data with mpg01 as the id variable, and use mpg01 as the x aesthetic.

Auto <- structure(list(mpg01 = structure(c(2L, 1L, 1L, 1L, 1L), .Label = c("FALSE", "TRUE"), class = "factor"), cylinders = c(8L, 8L, 8L, 8L, 8L), displacement = c(307, 350, 318, 304, 302), horsepower = c(130L, 165L, 150L, 150L, 140L), weight = c(3504L, 3693L, 3436L, 3433L, 3449L), acceleration = c(12, 11.5, 11, 12, 10.5), year = c(70L, 70L, 70L, 70L, 70L), origin = c(1L, 1L, 1L, 1L, 1L)), .Names = c("mpg01", "cylinders", "displacement", "horsepower", "weight", "acceleration", "year", "origin"), row.names = c(NA, 5L), class = "data.frame") 

vars <- c("mpg01", "cylinders", "displacement", "horsepower", "weight", "acceleration", "year", "origin")
Auto_melt <- melt(Auto[, vars], id.vars="mpg01")
ggplot(Auto_melt, aes(x=mpg01, y=value, fill=variable)) + geom_boxplot() + facet_wrap(~variable, scales="free_y")
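The question builds mpg01 with I(...), which gives a logical column, while the dput above stores it as a factor. A minimal sketch of creating such a factor before melting follows. It assumes mtcars as a stand-in for Auto, and the names cars and cars_melt as well as the labels "low" and "high" are illustrative only.

```r
library(reshape2)
library(ggplot2)

# Stand-in data: a few mtcars columns are assumed to behave like the Auto columns.
cars <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt")]

# Label each row as low or high mileage relative to the median, stored as a factor.
cars$mpg01 <- factor(cars$mpg >= median(cars$mpg),
                     levels = c(FALSE, TRUE),
                     labels = c("low", "high"))
cars$mpg <- NULL

# Long format: one row per (car, variable) pair, with mpg01 kept as the id column.
cars_melt <- melt(cars, id.vars = "mpg01")

# One facet per variable, one box per mileage group.
p <- ggplot(cars_melt, aes(x = mpg01, y = value, fill = mpg01)) +
  geom_boxplot() +
  facet_wrap(~variable, scales = "free_y")
print(p)
```

Here fill follows mpg01 instead of variable so that the two groups are colored consistently across facets; that is a presentation choice, not a requirement.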


I think you should tidy your data first and then draw the boxplot. I downloaded the data from the website:

> head(df)
  mpg01 cylinders displacement horsepower weight acceleration year origin
1    18         8          307        130   3504         12.0   70      1
2    15         8          350        165   3693         11.5   70      1
3    18         8          318        150   3436         11.0   70      1
4    16         8          304        150   3433         12.0   70      1
5    17         8          302        140   3449         10.5   70      1
6    15         8          429        198   4341         10.0   70      1

Use tidyr::gather to tidy the data.

library("tidyr")
library("dplyr")
library("ggplot2")
tidy_df <- df %>% gather("vars","values",-mpg01)

And tidy_df is:

> head(tidy_df)
  mpg01      vars values
1    18 cylinders      8
2    15 cylinders      8
3    18 cylinders      8
4    16 cylinders      8
5    17 cylinders      8
6    15 cylinders      8

Then you can draw the boxplot:

ggplot(data=tidy_df,aes(vars,values)) + geom_boxplot(aes(fill=vars))

It looks like this: [image]
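In current tidyr releases, gather() is superseded by pivot_longer(). The sketch below shows the same reshaping with pivot_longer(), again using mtcars as an assumed stand-in for df, and adds the split by mileage group that the question asks for. The column names vars and values mirror the ones used above; everything else is illustrative.

```r
library(tidyr)
library(ggplot2)

# Stand-in data: a few mtcars columns play the role of df.
df <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt")]
df$mpg01 <- factor(df$mpg >= median(df$mpg), labels = c("low", "high"))
df$mpg <- NULL

# Reshape every column except mpg01 into name/value pairs.
tidy_df <- pivot_longer(df, cols = -mpg01,
                        names_to = "vars", values_to = "values")

# Free y scales keep variables with very different ranges readable.
p <- ggplot(tidy_df, aes(x = mpg01, y = values, fill = mpg01)) +
  geom_boxplot() +
  facet_wrap(~vars, scales = "free_y")
print(p)
```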
