Spreading data over a date range from a column (R)
I have a set of survey data, where each survey covers multiple days. Here is an example of what the data looks like in its current form:
| Survey | Dates | Result |
|--------|--------------|--------|
| A | 11/30 - 12/1 | 33% |
| B | 12/2 - 12/4 | 26% |
| C | 12/4 - 12/5 | 39% |
This example can be created with the following:
frame <- data.frame(Survey = c('A','B','C'),
                    Dates = c('11/30 - 12/1', '12/2 - 12/4', '12/4 - 12/5'),
                    Result = c('33%', '26%', '39%'))
What I would like to do is make a column for each date and, if the date falls within the range of the survey, put the result in that cell. It would look something like this:
| Survey | 11/30 | 12/1 | 12/2 | 12/3 | 12/4 | 12/5 |
|--------|-------|------|------|------|------|------|
| A | 33% | 33% | | | | |
| B | | | 26% | 26% | 26% | |
| C | | | | | 39% | 39% |
Any help would be appreciated.
Here's an idea:
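library(dplyr)
library(tidyr)
frame %>%
  # one row per range endpoint
  separate_rows(Dates, sep = " - ") %>%
  # as.Date() fills in the current year when the format has none
  mutate(Dates = as.Date(Dates, format = "%m/%d")) %>%
  group_by(Survey) %>%
  # add the missing days between the first and last date of each survey
  complete(Dates = seq(min(Dates), max(Dates), 1)) %>%
  # carry each survey's result down into the newly added rows
  fill(Result) %>%
  spread(Dates, Result)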
Which gives:
# Survey `2017-11-30` `2017-12-01` `2017-12-02` `2017-12-03` `2017-12-04` `2017-12-05`
#* <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
#1 A 33% 33% NA NA NA NA
#2 B NA NA 26% 26% 26% NA
#3 C NA NA NA NA 39% 39%
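For readers on tidyr 1.0.0 or later, where `spread()` is superseded by `pivot_wider()`, the same steps can be sketched as below. This is a variation under stated assumptions, not the code from the answer above: the year 2017 is pasted on explicitly instead of letting `as.Date()` supply the current year, the input uses `stringsAsFactors = FALSE`, and the dates are formatted back to month/day before widening so the column names resemble the requested layout.

```r
library(dplyr)
library(tidyr)

frame <- data.frame(Survey = c("A", "B", "C"),
                    Dates  = c("11/30 - 12/1", "12/2 - 12/4", "12/4 - 12/5"),
                    Result = c("33%", "26%", "39%"),
                    stringsAsFactors = FALSE)

wide <- frame %>%
  # one row per range endpoint
  separate_rows(Dates, sep = " - ") %>%
  # assumption: every date falls in 2017
  mutate(Dates = as.Date(paste0("2017/", Dates), format = "%Y/%m/%d")) %>%
  group_by(Survey) %>%
  # add the days between the first and last date of each survey
  complete(Dates = seq(min(Dates), max(Dates), by = "day")) %>%
  # copy the result into the newly added days
  fill(Result) %>%
  ungroup() %>%
  # sort by date so the new columns come out in calendar order
  arrange(Dates) %>%
  mutate(Dates = format(Dates, "%m/%d")) %>%
  pivot_wider(names_from = Dates, values_from = Result)

wide
```

A range that crosses a year boundary (for example 12/30 - 1/2) would need the year attached per endpoint, which this sketch does not attempt.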
A tidyverse solution, though it requires reworking the Dates column a bit first:
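#install.packages('tidyverse')
library(tidyverse)
dframe <- data.frame(Survey = c('A','B','C'),
                     Dates = c('11/30 - 12/1', '12/2 - 12/4', '12/4 - 12/5'),
                     Result = c('33%', '26%', '39%'), stringsAsFactors = FALSE)
# Rewrite each range as the full list of days it covers, e.g.
# '12/2 - 12/4' becomes '12/02 - 12/03 - 12/04'.
# vapply() keeps Dates a character vector rather than a list column.
dframe$Dates <- vapply(strsplit(dframe$Dates, split = " - "), function(x) {
  x <- strptime(x, "%m/%d")
  x <- seq(min(x), max(x), '1 day')
  paste0(strftime(x, "%m/%d"), collapse = " - ")
}, character(1))
# One row per day, then one column per day
dframe %>%
  separate_rows(Dates, sep = " - ") %>%
  spread(Dates, Result)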
This should give:
Survey 11/30 12/01 12/02 12/03 12/04 12/05
A 33% 33% <NA> <NA> <NA> <NA>
B <NA> <NA> 26% 26% 26% <NA>
C <NA> <NA> <NA> <NA> 39% 39%
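To see the range-expansion step in isolation, here is a minimal base R sketch for a single range string. `expand_range` is a hypothetical helper name introduced only for illustration, and the default year of 2017 is an assumption, since the data itself carries no year.

```r
# Hypothetical helper: expand "m/d - m/d" into every day in the range,
# returned as zero-padded "mm/dd" strings.
expand_range <- function(range, year = 2017) {
  # split the string into its two endpoints
  ends <- strsplit(range, " - ", fixed = TRUE)[[1]]
  # attach the assumed year so the endpoints parse as full dates
  ends <- as.Date(paste(year, ends, sep = "/"), format = "%Y/%m/%d")
  # one element per day from the first endpoint to the last
  format(seq(ends[1], ends[2], by = "day"), "%m/%d")
}

expand_range("12/2 - 12/4")
# [1] "12/02" "12/03" "12/04"
```

Applying a helper like this to each row and then reshaping is the same idea as the answer above, just with the date handling separated from the reshaping.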
I hope this helps.