Split data into groups in R

Question

My data frame looks like this:

plant   distance
one 0
one 1
one 2
one 3
one 4
one 5
one 6
one 7
one 8
one 9
one 9.9
two 0
two 1
two 2
two 3
two 4
two 5
two 6
two 7
two 8
two 9
two 9.5

I want to split the distance values of each plant into groups by a fixed interval (for instance, interval = 3) and compute the percentage of observations in each group. Finally, I want to plot the percentage of each group for each plant, similar to this:

[Image: example plot]

My code:

library(ggplot2)
library(dplyr)
library(scales)  # provides percent(), used for the y-axis labels below

dat <- data %>% 
  mutate(group = factor(cut(distance, seq(0, max(distance), 3), labels = FALSE))) %>% 
  group_by(plant, group) %>% 
  summarise(percentage = n()) %>% 
  mutate(percentage = percentage / sum(percentage))
p <- ggplot(dat, aes(x = plant, y = percentage, fill = group)) + 
  geom_bar(stat = "identity", position = "stack")+
  scale_y_continuous(labels=percent)
p

But in my plot, group 4 is missing.

I also found that dat was wrong: group 4 was NA.

The likely reason is that the range covered by group 4 is shorter than the interval of 3. How can I fix this? Thank you in advance!

Answer 1

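I have solved the problem. The reason is that the breaks in cut(distance, seq(0, max(distance), 3), F) did not cover the maximum and minimum values. seq(0, max(distance), 3) stops at 9, so the values 9.5 and 9.9 fall outside the last interval, and because cut() builds intervals that are open on the left by default, the minimum value 0 is excluded from the first interval (0,3]. All of these values become NA.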
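To see the effect in isolation, here is a minimal base R sketch using the distances of plant "one". The variable names (breaks_short, breaks_full, bad, good) and the ceiling-based way of extending the breaks are illustrative assumptions, not part of the original code.

```r
# Distances for plant "one", taken from the question
distance <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9.9)
interval <- 3

# Original breaks: 0 3 6 9, which stop below the maximum of 9.9
breaks_short <- seq(0, max(distance), interval)
bad <- cut(distance, breaks_short, labels = FALSE)
# Intervals are (0,3], (3,6], (6,9], so 0 and 9.9 both map to NA

# Assumed alternative: extend the breaks to the next multiple of the
# interval above the maximum, and close the first interval on the left
breaks_full <- seq(0, ceiling(max(distance) / interval) * interval, by = interval)
good <- cut(distance, breaks_full, labels = FALSE, include.lowest = TRUE)

print(bad)
print(good)
```

With the extended breaks, every value is assigned to a group, including the short final interval.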

Here is my solution:

dat <- my_data %>% 
  mutate(group = factor(cut(distance,
                            seq(from = min(distance), by = 3, length.out = n() / 3 + 1),
                            include.lowest = TRUE))) %>% 
  count(plant, group) %>%
  group_by(plant) %>%
  mutate(percentage = n / sum(n))
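For completeness, the following is a self-contained sketch that rebuilds the example data and runs the whole pipeline through to the plot. It is one possible variant rather than the exact code above: the breaks are derived from the maximum distance instead of n(), geom_col() is used in place of geom_bar(stat = "identity"), and the names interval and breaks are assumptions introduced here.

```r
library(dplyr)
library(ggplot2)
library(scales)

# Rebuild the example data from the question
my_data <- data.frame(
  plant = rep(c("one", "two"), each = 11),
  distance = c(0:9, 9.9, 0:9, 9.5)
)

interval <- 3
# Breaks run from 0 to the first multiple of the interval at or above the maximum
breaks <- seq(0, ceiling(max(my_data$distance) / interval) * interval, by = interval)

dat <- my_data %>%
  mutate(group = cut(distance, breaks, include.lowest = TRUE)) %>%
  count(plant, group) %>%          # number of observations per plant and group
  group_by(plant) %>%
  mutate(percentage = n / sum(n)) %>%  # share of each group within its plant
  ungroup()

# Stacked bars, one per plant, with the y-axis shown as percentages
p <- ggplot(dat, aes(x = plant, y = percentage, fill = group)) +
  geom_col(position = "stack") +
  scale_y_continuous(labels = percent)
p
```

Deriving the breaks from the maximum keeps the number of groups tied to the data range rather than to the number of rows, which may matter if the distances are not evenly spaced.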

Answer 1
0 2015-04-01 03:51:12