
Create a column in one dataframe based on another column in another dataframe in R

I am fairly new to R and dplyr, and I am stuck on this issue:

I have two tables:

(1) Repairs done on cars (image not reproduced; see df_repair below)

(2) Amount owed on each car over time (image not reproduced; see df_owed below)

What I would like to do is create three extra columns on the repair table that give me: (1) the amount owed on the car when the repair was done, (2) the amount owed three months down the road, and (3) the last payment record on file.

And in the case where the repair date does not match any payment record, I need to use the closest amount owed on record.

So something like:

[Image of the desired output table, not reproduced here]

Any ideas how I can do that?

Here are the data frames:

Repairs done on cars:

df_repair <- data.frame(
  unique_id = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"),
  car_number = c(1, 1, 1, 2, 2, 2, 3, 3),
  repair_done = c("Front Fender", "Front Lights", "Rear Lights", "Front Fender",
                  "Rear Fender", "Rear Lights", "Front Lights", "Front Fender"),
  YearMonth = c("2014-03", "2016-03", "2016-07", "2015-05",
                "2015-08", "2016-01", "2018-01", "2018-05")
)


df_owed <- data.frame(
  car_number = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  YearMonth = c("2014-02", "2014-05", "2014-06", "2014-08", "2015-06", "2015-12",
                "2016-03", "2016-04", "2016-05", "2016-06", "2016-07", "2016-08",
                "2015-05", "2015-08", "2015-12", "2016-03", "2018-01", "2018-02",
                "2018-03", "2018-04", "2018-05", "2018-09"),
  amount_owed = c(20000, 18000, 17500, 16000, 10000, 7000, 6000, 5500, 5000, 4500,
                  4000, 3000, 10000, 8000, 6000, 0, 50000, 40000, 35000, 30000,
                  25000, 15000)
)

Using zoo for year-months and tidyverse, you could try the following. Use left_join to add all the df_owed data to your df_repair data, joining by car_number. Because both data frames have a YearMonth column, the join renames them YearMonth.x (repair month) and YearMonth.y (payment-record month). You can convert these year-month columns to yearmon objects with zoo. Then sort your rows by the year-month column from df_owed.
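To make the yearmon step a bit more concrete, here is a minimal sketch of how the conversion and the month arithmetic behave. The variable names (repair_month, owed_months, three_months_later) are just illustrative and are not part of the question's data.

```r
library(zoo)

# Convert "YYYY-MM" strings to yearmon objects
repair_month <- as.yearmon("2016-03")
owed_months  <- as.yearmon(c("2016-03", "2016-04", "2016-06", "2016-07"))

# yearmon values are stored as year + (month - 1)/12,
# so adding 3/12 moves forward by three months
three_months_later <- repair_month + 3/12
format(three_months_later, "%Y-%m")   # "2016-06"

# Comparisons follow calendar order
owed_months >= three_months_later     # FALSE FALSE TRUE TRUE
```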

For each unique_id (using group_by) you can create your three columns of interest. The first uses the latest amount_owed where the owed date is on or before the service date. The second (3 months) uses the first amount_owed value where the owed date is at least 3 months (3/12 of a year) after the service date. Finally, the most recent one just takes the last value of amount_owed.
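As a rough illustration of those three selections for a single repair, here is a base R sketch on plain vectors. It uses a subset of the car 1 records from df_owed and the repair month of A2. tail(x, 1) and head(x, 1) stand in for dplyr's last() and first() here; note that they return an empty vector rather than NA when nothing matches.

```r
library(zoo)

# A subset of the payment records for car 1, already sorted by month
owed_month  <- as.yearmon(c("2016-03", "2016-04", "2016-05",
                            "2016-06", "2016-07", "2016-08"))
amount_owed <- c(6000, 5500, 5000, 4500, 4000, 3000)

repair_month <- as.yearmon("2016-03")   # repair A2

# Latest record on or before the repair month
owed_repair_done <- tail(amount_owed[owed_month <= repair_month], 1)          # 6000
# First record at least three months after the repair month
owed_3_months    <- head(amount_owed[owed_month >= repair_month + 3/12], 1)   # 4500
# Last record on file
owed_most_recent <- tail(amount_owed, 1)                                      # 3000
```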

With the example data, the results of the pipeline below differ a bit from the images in the question, possibly because the data frames do not match the images in the post.

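library(tidyverse)
library(zoo)

df_repair %>%
  # Attach every df_owed row for the same car to each repair
  # (YearMonth.x is the repair month, YearMonth.y the payment-record month)
  left_join(df_owed, by = "car_number") %>%
  # Convert both "YYYY-MM" columns to zoo yearmon objects
  mutate(across(c(YearMonth.x, YearMonth.y), as.yearmon)) %>%
  # Sort by payment-record month so first()/last() follow time order
  arrange(YearMonth.y) %>%
  group_by(unique_id, car_number) %>%
  summarise(
    # Latest record on or before the repair month (NA if there is none)
    owed_repair_done = last(amount_owed[YearMonth.y <= YearMonth.x]),
    # First record at least 3 months (3/12 of a year) after the repair month
    owed_3_months = first(amount_owed[YearMonth.y >= YearMonth.x + 3/12]),
    # Most recent record on file for the car
    owed_most_recent = last(amount_owed),
    .groups = "drop"
  )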
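The question also asks for the closest amount owed on record when the repair date has no matching payment record, whereas the pipeline above takes the latest record on or before the repair month and gives NA when there is none. One possible way to get a nearest-record lookup instead is sketched below. closest_owed is a hypothetical helper, not part of dplyr or zoo, and the three records are the first three car 1 rows of df_owed.

```r
library(zoo)

# Hypothetical helper: amount owed at the record closest in time to `target`
closest_owed <- function(owed_month, amount_owed, target) {
  # Work in whole months to avoid floating-point noise in the comparison
  gap <- abs(round(12 * as.numeric(owed_month)) - round(12 * as.numeric(target)))
  # which.min() returns the first minimum, so ties go to the earlier
  # record when the inputs are sorted by month
  amount_owed[which.min(gap)]
}

owed_month  <- as.yearmon(c("2014-02", "2014-05", "2014-06"))
amount_owed <- c(20000, 18000, 17500)

closest_owed(owed_month, amount_owed, as.yearmon("2014-03"))   # 20000
```

Inside summarise, the same idea could replace the first expression, for example owed_repair_done = closest_owed(YearMonth.y, amount_owed, first(YearMonth.x)), assuming the helper is defined beforehand.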
