Create a column in one data frame based on a column in another data frame in R

Question

I am fairly new to R and dplyr and I am stuck on this issue:

I have two tables:

(1) Repairs done on cars

(2) Amount owed on each car over time

What I would like to do is create three extra columns on the repair table that give me: (1) the amount owed on the car when the repair was done, (2) the amount owed 3 months down the road, and (3) the last payment record on file.

In the case where the repair date does not match any payment record, I need to use the closest amount owed on record.

So something like:

Any ideas how I can do that?

Here are the data frames:

Repairs done on cars:

df_repair <- data.frame(
  unique_id = c("A1","A2","A3","A4","A5","A6","A7","A8"),
  car_number = c(1,1,1,2,2,2,3,3),
  repair_done = c("Front Fender","Front Lights","Rear Lights","Front Fender",
                  "Rear Fender","Rear Lights","Front Lights","Front Fender"),
  YearMonth = c("2014-03","2016-03","2016-07","2015-05","2015-08","2016-01","2018-01","2018-05")
)

df_owed <- data.frame(
  car_number = c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,3),
  YearMonth = c("2014-02","2014-05","2014-06","2014-08","2015-06","2015-12","2016-03","2016-04","2016-05","2016-06","2016-07","2016-08","2015-05","2015-08","2015-12","2016-03","2018-01","2018-02","2018-03","2018-04","2018-05","2018-09"),
  amount_owed = c(20000,18000,17500,16000,10000,7000,6000,5500,5000,4500,4000,3000,10000,8000,6000,0,50000,40000,35000,30000,25000,15000)
)

Answer 1

Using zoo for year-months and the tidyverse, you could try the following. Using left_join, add all the df_owed data to your df_repair data by car_number. Because both data frames have a YearMonth column, the join names them YearMonth.x (from df_repair) and YearMonth.y (from df_owed). You can convert both year-month columns to yearmon objects with zoo. Then sort your rows by the year-month column from df_owed.

For each unique_id (using group_by) you can create your three columns of interest. The first uses the latest amount_owed where the owed date is on or before the repair date. The second (3 months) uses the first amount_owed value where the owed date is at least 3 months (3/12 of a year) after the repair date. Finally, the most recent simply takes the last value of amount_owed.
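To see why adding 3/12 moves a date forward by three months, here is a minimal sketch, assuming the zoo package is installed. A yearmon value is stored as the year plus a fraction of a year, with each month counting as 1/12, so arithmetic and comparisons work on it directly. The variable names are illustrative only and do not come from the post.

```r
library(zoo)

# Parse "YYYY-MM" strings into yearmon objects
repair_month <- as.yearmon("2016-03")
owed_months  <- as.yearmon(c("2016-04", "2016-05", "2016-06", "2016-07"))

# Adding 3/12 of a year shifts the month forward by three months
three_months_later <- repair_month + 3/12
print(three_months_later)

# The comparison keeps only records dated at least three months after the repair
print(owed_months[owed_months >= three_months_later])
```

Wrapping the filtered vector in first() or last(), as the pipeline does, then picks the earliest or latest qualifying record once the rows are sorted by date.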

Using the example data, the results differ a bit, possibly because the data frames do not match the images in the post.

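library(tidyverse)
library(zoo)

df_repair %>%
  # One row per (repair, owed record) pair for the same car
  left_join(df_owed, by = "car_number") %>%
  # YearMonth.x is the repair month, YearMonth.y is the owed-record month
  mutate_at(c("YearMonth.x", "YearMonth.y"), as.yearmon) %>%
  # Sort by owed date so first() and last() follow time order
  arrange(YearMonth.y) %>%
  group_by(unique_id, car_number) %>%
  summarise(
    # Latest record on or before the repair month
    owed_repair_done = last(amount_owed[YearMonth.y <= YearMonth.x]),
    # Earliest record at least three months after the repair month
    owed_3_months = first(amount_owed[YearMonth.y >= YearMonth.x + 3/12]),
    # Last record on file for the car
    owed_most_recent = last(amount_owed)
  )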
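
The code above takes the latest record on or before the repair month. If "closest" in the question is meant in either direction, one possible approach is to pick the record with the smallest gap in months. This is only a sketch under that assumption: closest_owed is a hypothetical helper, not part of zoo or dplyr, and the sample values are the first three df_owed rows for car 1.

```r
library(zoo)

# Hypothetical helper: amount owed at the record nearest in time to `target`
closest_owed <- function(target, owed_months, amounts) {
  # Whole-month distance between each record and the target month
  gap <- abs(round(12 * (as.numeric(owed_months) - as.numeric(target))))
  # which.min returns the first minimum, so a tie goes to whichever
  # record comes first in the vector
  amounts[which.min(gap)]
}

owed_months <- as.yearmon(c("2014-02", "2014-05", "2014-06"))
amounts     <- c(20000, 18000, 17500)

# Repair in 2014-03: the 2014-02 record is one month away, 2014-05 is two
closest_owed(as.yearmon("2014-03"), owed_months, amounts)
```

Inside the grouped summarise, such a helper could be called with YearMonth.x[1], YearMonth.y, and amount_owed in place of the last(...) expression.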

Solution 1
1 accepted 2021-02-25 00:21:44