
Evenly distribute data points in boxplot in R (using ggplot2)

I have a problem with spacing data points in a boxplot. I use the following code.

DF1 <- data.frame(x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30), y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA))
library(ggplot2)
library(tidyverse)
n <- 11
DF1 <- as.data.frame(DF1)
DF1 <- reshape2::melt(DF1)
DF1 %>%
  group_by(variable) %>%
  arrange(value) %>%
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ggplot(aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))

This results in the following:

[Image: R boxplot ggplot2]

For x, all data points are evenly distributed from left to right, but y has fewer data points, and they are not: they are bunched on the left side of the box. How can the above code be modified to space out the data points for y evenly too? I would appreciate any suggestions.

I found a somewhat similar post here, but it did not help me.

Thank you.

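The problem is the NA values in y. They remain as rows in the long data, so n() counts 11 rows for y and seq() creates 11 positions; arrange() sorts NA last, so the six real values take only the leftmost six positions. After you go to long format, you can simply omit them: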

plot_data <- DF1 %>%
  na.omit() %>%  # added: drop the NA rows so n() counts only real values
  group_by(variable) %>%
  arrange(value) %>%
  mutate(xcoord = seq(-0.25, 0.25, length.out = n()))

ggplot(plot_data, aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
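For reference, here is a self-contained sketch of the same fix that can be run from a fresh session. It is a variant, not the code above: it assumes tidyr's pivot_longer() with values_drop_na = TRUE in place of reshape2::melt() followed by na.omit(), and the name `ranges` is only an illustrative helper used to check the result.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

DF1 <- data.frame(
  x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30),
  y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA)
)

# Assumption: pivot_longer() replaces melt(); values_drop_na = TRUE
# removes the NA rows while reshaping to long format.
plot_data <- DF1 %>%
  pivot_longer(everything(), names_to = "variable", values_to = "value",
               values_drop_na = TRUE) %>%
  mutate(variable = factor(variable)) %>%   # factor so as.integer() gives 1, 2
  group_by(variable) %>%
  arrange(value, .by_group = TRUE) %>%
  # n() now counts only non-missing values in each group
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ungroup()

# Illustrative check: each group should span the full -0.25 to 0.25 range.
ranges <- plot_data %>%
  group_by(variable) %>%
  summarise(n = n(), left = min(xcoord), right = max(xcoord))
print(ranges)

ggplot(plot_data, aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
```

The printed summary should show 11 points for x and 6 for y, with both groups running from -0.25 to 0.25, which is what makes the points span the same width in each box.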

