
R: adding a column to one dataframe based on another dataframe and the date

I have a dataframe (Reports_following_AC) where each row represents a report. This dataframe looks like this:

> head(Reports_following_AC)

 Park               Month      Obs_con Coy_Season Number_AC Number_4w_AC
  <chr>              <date>       <dbl>      <dbl>     <int>        <int>
1 14st NE - Coventry 2019-06-14       1          2         8            0
2 14st NE - Coventry 2019-10-12       0          3        10            0
3 14st NE - Coventry 2019-10-13       0          3        10            0
4 14st NE - Coventry 2021-06-23       1          2        10            0
5 Airways Park       2020-07-05       0          2         3            0
6 Airways Park       2021-07-18       1          2         6            0

I would like to add a column, "Last_treatment", to my Reports_following_AC dataframe, based on the "AC_code" column of the Reaction_per_park_per_day_3 dataframe (below). In my Reaction_per_park_per_day_3 dataframe, each row represents an AC event.

The Last_treatment column added to Reports_following_AC would hold the "AC_code" (treatment) of the last AC event before a report in the same Park, provided that AC event took place in the 4 weeks (28 days) before the report.

> head(Reaction_per_park_per_day_3)
# A tibble: 6 x 10
Park                 Date      AC_code 
<chr>                <date>       <dbl>
1 14st NE - Coventry 2019-06-05       6 
2 14st NE - Coventry 2019-07-12       7
3 14st NE - Coventry 2019-10-05       1
4 14st NE - Coventry 2021-06-18       2 
5 Airways Park       2020-06-26       1
6 Airways Park       2021-06-30       5

The resulting dataframe would therefore look like this:

 Park               Month      Obs_con Coy_Season Number_AC Number_4w_AC  Last_treatment
  <chr>              <date>       <dbl>      <dbl>     <int>        <int>          <dbl>
1 14st NE - Coventry 2019-06-14       1          2         8            0              6 
2 14st NE - Coventry 2019-10-12       0          3        10            0              1  
3 14st NE - Coventry 2019-10-13       0          3        10            0              1
4 14st NE - Coventry 2021-06-23       1          2        10            0             NA
5 Airways Park       2020-07-05       0          2         3            0              1
6 Airways Park       2021-07-18       1          2         6            0              5

I tried the following code, but it's not quite working: instead of returning the AC_code of only the last AC event before each report (when that event falls within 28 days of the report), it returns the AC_code of every AC event before the report.

Reports_following_AC_1 <- Reports_following_AC %>%
  left_join(select(Reaction_per_park_per_day_3, c(Park, Date, AC_code))) %>%
  filter(Date <= Month) %>%
  group_by(Park, Month, Obs_con, Coy_Season) %>%
  mutate(Last_treatment = if_else((Month - max(Date)) < 28, AC_code, as.character(NA))) %>%
  distinct

> head(Reports_following_AC_1)

  Park               Month      Obs_con Coy_Season Number_AC Number_4w_AC Date       AC_code Last_treatment
  <chr>              <date>       <dbl>      <dbl>     <int>        <int> <date>     <chr>   <chr>         
1 14st NE - Coventry 2019-06-14       1          2         8            0 2019-01-30 3       NA            
2 14st NE - Coventry 2019-06-14       1          2         8            0 2019-01-30 4       NA            
3 14st NE - Coventry 2019-06-14       1          2         8            0 2019-01-30 1       NA            
4 14st NE - Coventry 2019-06-14       1          2         8            0 2019-02-01 4       NA            
5 14st NE - Coventry 2019-06-14       1          2         8            0 2019-02-01 2       NA            
6 14st NE - Coventry 2019-06-14       1          2         8            0 2019-02-04 1       NA

Ideally I'm looking for a dplyr solution, but I'm open to other possibilities.

If I understand correctly, you want to join with a selection of columns from Reaction_per_park_per_day_3? This should work:

Reports_following_AC_1 <- Reports_following_AC %>%
  # join only the needed columns; the event table has Date, not Month
  left_join(select(Reaction_per_park_per_day_3, c(Park, Date, AC_code)), by = "Park") %>%
  filter(Date <= Month) %>%
  group_by(Park, Month, Obs_con, Coy_Season) %>%
  mutate(Last_treatment = if_else((Month - max(Date)) < 28, lag(AC_code), NA_character_)) %>%
  distinct()

I figured it out!

library(dplyr)

Reports_following_AC_1 <- Reports_following_AC %>%
  left_join(select(Reaction_per_park_per_day_3, c(Park, Date, AC_code))) %>%
  # keep only AC events strictly before the report date
  filter(Date < Month) %>%
  group_by(Park, Month, Obs_con, Coy_Season, Number_4w_AC) %>%
  # last() takes the final row of each group, so this relies on events being sorted by Date
  mutate(Last_treatment = last(if_else((Month - max(Date)) < 28, AC_code, as.character(NA)))) %>%
  select(c(Park, Month, Obs_con, Coy_Season, Number_4w_AC, Last_treatment)) %>%
  distinct()
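
One thing to watch with the code above: because of the filter() after the left_join(), reports that have no earlier AC event in their park appear to be dropped rather than kept with NA. A possible alternative, offered only as a sketch and not as the accepted method, is to compute the most recent qualifying event per report separately and join it back. The toy tibbles below copy the rows shown in the question; the seventh report row, the object names (reports, events, window_days, last_treatment, result) and the numeric AC_code are my own assumptions, and relationship = "many-to-many" assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy data copied from the rows displayed in the question (relevant columns only).
reports <- tibble(
  Park  = c(rep("14st NE - Coventry", 4), rep("Airways Park", 3)),
  Month = as.Date(c("2019-06-14", "2019-10-12", "2019-10-13", "2021-06-23",
                    "2020-07-05", "2021-07-18",
                    "2021-01-15"))  # hypothetical report with no recent AC event
)
events <- tibble(
  Park    = c(rep("14st NE - Coventry", 4), rep("Airways Park", 2)),
  Date    = as.Date(c("2019-06-05", "2019-07-12", "2019-10-05", "2021-06-18",
                      "2020-06-26", "2021-06-30")),
  AC_code = c(6, 7, 1, 2, 1, 5)
)

window_days <- 28  # assumption: "4 weeks" read as at most 28 days; the code above uses < 28

last_treatment <- reports %>%
  distinct(Park, Month) %>%
  # pair every report with every AC event in the same park
  inner_join(events, by = "Park", relationship = "many-to-many") %>%
  # keep events strictly before the report and inside the window
  filter(Date < Month, as.numeric(Month - Date) <= window_days) %>%
  group_by(Park, Month) %>%
  # most recent event per report; same-day ties are broken by row order
  slice_max(Date, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(Park, Month, Last_treatment = AC_code)

# left_join() keeps every report; those without a qualifying event get NA
result <- reports %>%
  left_join(last_treatment, by = c("Park", "Month"))

print(result)
```

With only the six event rows displayed in the question, the 2021-06-23 report gets code 2, since the 2021-06-18 event is five days earlier. The expected output in the question lists NA for that row, so it is worth checking the result against the full data.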
