Merge time series data by user and a specific date

Question

I have a data frame like the table below. It is a time series for each user.

User	Date	Age	SentimentScore
a	9.19	20	1
a	11.20	20	2
a	12.10	20	3
b	9.30	19	1
b	10.1	19	4
c	12.1	21	5

I would like to produce a table like this. Trial 1 is the mean sentiment score before a given date (for example, November 7). Trial 2 is the mean sentiment score after that date.

User Age trial    Mean Sentiment Score
a    20  1          1   --> (mean SentimentScore before 11.7)
a    20  2          2.5 --> (mean SentimentScore after 11.7)
b    19  1          2.5 --> (mean SentimentScore before 11.7)
c    21  1          NA  --> (mean SentimentScore before 11.7)

Answer 1

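library(data.table)

# Label each row by period: 1 = on or before the cutoff date, 2 = after it
dt[, trial := fcase(Date <= as.Date("2021-11-07"), 1,
                    Date >  as.Date("2021-11-07"), 2)]

# Mean score per user, age and period
dt[,.( Mean.Sentiment.Score = mean(SentimentScore) ),
   by = .(User,Age,trial)]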

Result:

   User Age trial Mean.Sentiment.Score
1:    a  20     1                  1.0
2:    a  20     2                  2.5
3:    b  19     1                  2.5
4:    c  21     2                  5.0
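
The desired table in the question has an NA row for a user with no observations in one of the periods, whereas a grouped mean only returns the user/period combinations that actually occur. If those empty combinations are wanted, one possible approach is base R's tapply, which leaves empty cells as NA. This is only a sketch: it assumes the same 2021-11-07 cutoff and the same hand-typed data as this answer, and the names df, cutoff and means are illustrative.

```r
# Same hand-typed data as in this answer, as a plain data.frame
df <- data.frame(
  User = c("a", "a", "a", "b", "b", "c"),
  Date = as.Date(c("2021-09-19", "2021-11-20", "2021-12-10",
                   "2021-09-30", "2021-10-01", "2021-12-01")),
  Age = c(20, 20, 20, 19, 19, 21),
  SentimentScore = c(1, 2, 3, 1, 4, 5)
)

# Assumed cutoff date (taken from the example in the question)
cutoff <- as.Date("2021-11-07")

# 1 = on or before the cutoff, 2 = after it
df$trial <- factor(ifelse(df$Date <= cutoff, 1, 2), levels = c(1, 2))

# User x trial matrix of means; cells with no observations stay NA
means <- tapply(df$SentimentScore,
                list(User = df$User, trial = df$trial),
                mean)
means
```

Here user b has NA for trial 2 and user c has NA for trial 1, because neither has any rows in those periods.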

Data (I typed it in by hand; you should provide dput output in the question):

library(data.table)
dt <- data.table(
    User = c("a", "a", "a", "b", "b", "c"),
    Date = as.Date(c("2021-09-19", "2021-11-20", "2021-12-10", "2021-09-30",
                     "2021-10-01", "2021-12-01")),
    Age = c(20, 20, 20, 19, 19, 21),
    SentimentScore = c(1, 2, 3, 1, 4, 5)
)
dt
#>    User       Date Age SentimentScore
#> 1:    a 2021-09-19  20              1
#> 2:    a 2021-11-20  20              2
#> 3:    a 2021-12-10  20              3
#> 4:    b 2021-09-30  19              1
#> 5:    b 2021-10-01  19              4
#> 6:    c 2021-12-01  21              5

Created on 2021-04-28 by the reprex package (v2.0.0)
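
The dates in the question are written as month.day values such as 9.19 and 11.20, so they need converting before they can be compared with a cutoff. A minimal base R sketch of one way to do that follows. It assumes the values are read as character strings and that every observation falls in 2021; the question does not state the year, and the names raw and dates are illustrative.

```r
# Raw month.day values from the question, kept as character strings.
# Reading them as numbers would lose information: 11.20 becomes 11.2.
raw <- c("9.19", "11.20", "12.10", "9.30", "10.1", "12.1")

# Assumption: all dates are in 2021, so prepend the year before parsing
dates <- as.Date(paste0("2021.", raw), format = "%Y.%m.%d")
dates
```

Keeping the column as character on import (for example with colClasses = "character" in read.table) avoids the trailing-zero problem noted in the comment.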

Answer 2

Is this what you are trying to do?

library(dplyr)
# Trial 2 = dates after the 2021-11-07 cutoff, Trial 1 = on or before it
df %>% mutate(Date = as.Date(Date)) %>%
  group_by(User, Trial = ifelse(Date > as.Date("2021-11-07"), 2, 1)) %>%
  summarise(Age = mean(Age),
            SentimentScore = mean(SentimentScore), .groups = 'drop')

# A tibble: 4 x 4
  User  Trial   Age SentimentScore
  <chr> <dbl> <dbl>          <dbl>
1 a         1    20            1  
2 a         2    20            2.5
3 b         1    19            2.5
4 c         2    21            5  

Data used

df <- read.table(text = "User   Date    Age SentimentScore
a   2021-09-19  20  1
a   2021-11-20  20  2
a   2021-12-10  20  3
b   2021-09-30  19  1
b   2021-10-01  19  4
c   2021-12-01  21  5", header = TRUE)

Solution 1
1 2021-04-28 06:06:21

Solution 2
0 Accepted 2021-04-28 05:46:32
