Group and count observations per year from a dataset containing interval data

Question

I have data concerning the activity of a number of different writers. The data includes the start.date and end.date of their writing careers.

library("tidyverse")
writing_period_data <- tribble(
  ~start.date, ~end.date, ~writer, ~topic,
  12, 18, "a", sample(letters[10:20],1),
  14, 20, "b", sample(letters[10:20],1),
  17, 22, "c", sample(letters[10:20],1),
  15, 30, "a", sample(letters[10:20],1)
)

I would ultimately like to create a joyplot of this data, which requires me to generate the following data structure:

desired_output <- tribble(
  ~year, ~count, ~writer,
  12, 1, "a",
  13, 1, "a",
  14, 1, "a",
  14, 1, "b",
  15, 2, "a",
  15, 1, "b",
  16, 2, "a",
  16, 1, "b",
  17, 2, "a",
  17, 1, "b",
  17, 1, "c",
  18, 2, "a",
  18, 1, "b",
  18, 1, "c",
  19, 1, "a",
  19, 1, "b",
  19, 1, "c",
  20, 1, "a",
  20, 1, "b",
  20, 1, "c",
  21, 1, "a",
  21, 1, "c",
  22, 1, "a",
  22, 1, "c",
  23, 1, "a",
  24, 1, "a"
)

Plotted as a chart, this shows the distribution of writers across the time period of interest:

desired_output %>%
  ggplot(aes(x = year, y = count, fill = writer)) + geom_col()

How can I generate desired_output from writing_period_data?

Answer 1

A solution using tidyverse. dt is the final output.

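library(tidyverse)

dt <- writing_period_data %>%
  # one integer sequence start.date:end.date per row, stored as a list-column
  mutate(year = map2(start.date, end.date, `:`)) %>%
  # one row per year; naming the column avoids the warning a bare unnest() gives in tidyr >= 1.0
  unnest(year) %>%
  # number of intervals covering each year, per writer
  count(year, writer) %>%
  select(year, count = n, writer)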
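
To see what each step of the pipeline contributes, here is a minimal sketch that runs the same idea in separate stages on the question's intervals. The intermediate names expanded and long are only illustrative, and the topic column is left out on the assumption that it plays no role in the counting.

```r
library(tidyverse)

# Same intervals as in the question, without the topic column.
writing_period_data <- tribble(
  ~start.date, ~end.date, ~writer,
  12, 18, "a",
  14, 20, "b",
  17, 22, "c",
  15, 30, "a"
)

# Step 1: build one sequence of years per row, held in a list-column.
expanded <- writing_period_data %>%
  mutate(year = map2(start.date, end.date, `:`))

# Step 2: unnest the list-column so every active year gets its own row.
long <- expanded %>%
  unnest(year)

# Step 3: count how many intervals cover each year, per writer.
dt <- long %>%
  count(year, writer) %>%
  select(year, count = n, writer)

# Writer "a" has two overlapping intervals from 15 to 18, so count is 2 there.
filter(dt, writer == "a", year %in% 14:19)
```

Because every interval is expanded in full, dt runs through year 30, where writer a's second interval ends, whereas the desired_output listed in the question stops at 24.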

2 Accepted 2017-07-12 14:00:29