How to plot categorical vs numerical data with vioplot in R?

I have a data set with a categorical column "colors", which has 4 colors. One of the other 2 columns is quantitative and is called "pollen". I am trying to get vioplot to make 4 separate violin plots of pollen, one per color. Here is a data sample:

[image: data sample]

The data is available at http://www.uwyo.edu/crawford/datasets/beeflowers.txt

I made 4 subsets of the data with:

blue <- subset(beeflowers4, colors=="blue", select=c(pollen, colors))
green <- subset(beeflowers4, colors=="green", select=c(pollen, colors))
purple <- subset(beeflowers4, colors=="pruple", select=c(pollen, colors))
red <- subset(beeflowers4, colors=="red", select=c(pollen, colors))

I then tried to draw a violin plot with:

vioplot(blue, green, purple, red, names=c("blue", "green", "purple", "red"), col="yellow")

However, I got this error:

#Error in FUN(X[[1L]], ...) : 
#  only defined on a data frame with all numeric variables

Is there any way for vioplot to plot pollen vs colors?

Here's another way that is much less repetitive. When you find yourself typing the same thing over and over, like those four subset lines, it's a sign that there's a more efficient way.

In this case, ggplot takes data in the long form that you already have, so there's no need for any subsetting or reshaping.

# import data
x <- read.table("http://www.uwyo.edu/crawford/datasets/beeflowers.txt", 
                stringsAsFactors = FALSE,
                header = TRUE)

# inspect
str(x); View(x)

# get rid of that 999, presumably missing data
x <- x[x$pollen != 999, ]

# plot
library(ggplot2)
ggplot(x, aes(colors, pollen)) +
  geom_violin()

[image: resulting violin plot of pollen by color]
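If 999 really is a missing-value code, as the answer above presumes, another option is to let `read.table` convert it to `NA` on import through its `na.strings` argument. This is only a sketch: the inline sample below is made up to mimic a pollen/colors layout and is not taken from beeflowers.txt.

```r
# Made-up sample in a pollen/colors layout; values are illustrative only.
raw <- "pollen colors
0.12 blue
999 green
0.40 purple
0.33 red"

# Treat the string "999" as missing while reading (assumption: 999 is a
# missing-value sentinel, which the source data does not confirm).
x <- read.table(text = raw,
                header = TRUE,
                stringsAsFactors = FALSE,
                na.strings = c("NA", "999"))

# Keep only the rows with an observed pollen value
x_clean <- x[!is.na(x$pollen), ]
```

Filtering on `is.na()` avoids comparing against a magic number later in the script, at the cost of assuming no genuine measurement equals 999.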

You misspelled "purple" when you subsetted. Also, the first four arguments to the vioplot function need to be vectors, not data frames. The following code should work:

library(vioplot)

# one subset per color; "purple" is now spelled correctly
blue <- subset(beeflowers4, colors=="blue", select=c(pollen, colors))
green <- subset(beeflowers4, colors=="green", select=c(pollen, colors))
purple <- subset(beeflowers4, colors=="purple", select=c(pollen, colors))
red <- subset(beeflowers4, colors=="red", select=c(pollen, colors))

# pass the numeric pollen column of each subset, not the whole data frame
vioplot(blue$pollen, green$pollen, purple$pollen, red$pollen, names=c("blue", "green", "purple", "red"), col="yellow")
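The same "vectors, not data frames" idea can be written without the four subset lines. The sketch below uses base R's `split()` to build one numeric vector per color and passes them to vioplot with `do.call()`. The `beeflowers4` data frame here is made-up stand-in data with the question's column names, and the plot is drawn only if the vioplot package is installed.

```r
# Made-up stand-in for the question's data; values are illustrative only.
set.seed(1)
beeflowers4 <- data.frame(
  pollen = runif(40),
  colors = rep(c("blue", "green", "purple", "red"), each = 10),
  stringsAsFactors = FALSE
)

# split() returns a named list holding one numeric vector per color
groups <- split(beeflowers4$pollen, beeflowers4$colors)

# Pass each vector as a separate positional argument, as in the call above
if (requireNamespace("vioplot", quietly = TRUE)) {
  do.call(vioplot::vioplot,
          c(unname(groups), list(names = names(groups), col = "yellow")))
}
```

Because the group names come from the data itself, a misspelled color in a filter condition cannot silently produce an empty group.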
