Add values from one data frame to another based on two matching conditions in R

Question

I have the two data frames below:

dput output of df1:

structure(list(Location = c("1100 2ND AVENUE", "1100 2ND AVENUE", 
"1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE"
), `Ivend Name` = c("3 Mskt 1.92oz", "Almond Joy 1.61oz", "Aquafina 20oz", 
"BCanyonChptleAdzuk1.5oz", "BlkForest FrtSnk 2.25oz", "BluDimndSmkhseAlmd1.5oz"
), `Category Name` = c("Candy", "Candy", "Water", "Salty Snacks", 
"Candy", "Nuts/Trailmix"), Calories = c(240, 220, 0, 215, 193, 
260), Sugars = c("36", "20", "0", "2", "32", "2"), Month = structure(c(4L, 
4L, 4L, 4L, 4L, 4L), .Label = c("Oct", "Nov", "Dec", "Jan", "Feb", 
"Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"), class = "factor"), 
    Products_available_per_machine = c(0, 0, 0, 0, 0, 0), Units_sold = c(0, 
    0, 0, 0, 0, 0), Total_Sales = c(0, 0, 0, 0, 0, 0), Spoils = c(0, 
    0, 0, 0, 0, 0), Building = c("1100 2ND", "1100 2ND", "1100 2ND", 
    "1100 2ND", "1100 2ND", "1100 2ND"), Item = structure(c(2L, 
    2L, 1L, 2L, 2L, 2L), .Label = c("Beverage", "Food"), class = "factor"), 
    Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2019", class = "factor")), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"))

dput output of df2:

structure(list(`Date Ran` = structure(c(1548892800, 1551312000, 
1553817600, 1556582400, 1561680000, 1564531200), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Year = c(2019, 2019, 2019, 2019, 2019, 
2019), Month = c("January", "February", "March", "April", "June", 
"July"), Location = c("SEA18", "SEA18", "SEA18", "SEA18", "SEA18", 
"SEA18"), Building = c("Alexandria", "Alexandria", "Alexandria", 
"Alexandria", "Alexandria", "Alexandria"), Population = c(1177, 
1179, 1178, 1156, 1163, 1163)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to pull the Population column from df2 and add it to df1, matching on 'Building' and 'Month', in the order in which Population is filled in df2.

I tried the following with match and with merge, but the column is NULL when I run them:

df_2019_final1$Population <- df_2019_pop$Population[match(df_2019_final1$Month, df_2019_pop$Month, df_2019_final1$Building, df_2019_pop$Building)]

subset_df_pop <- df_2019_pop[, c("Month", "Building", "Population")]


updated_2019_test <- merge(df_2019_final1, subset_df_pop, by = c('Month', 'Building'))

Both produce NULLs and a blank data frame.

Any help would be greatly appreciated.

Answer 1

In one of the datasets, 'Month' is abbreviated, and in the other it is the full name. We can convert either one to the other's format, and then the merge will work.
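As a small illustration of converting in either direction, the sketch below uses R's built-in `month.name` and `month.abb` constants. The vectors `full` and `short` are made-up sample values, not taken from the question's data.

```r
# Hypothetical sample values, for illustration only.
full  <- c("January", "February", "March")
short <- c("Jan", "Feb", "Mar")

# Full name -> abbreviation: look up each name's position in month.name,
# then index month.abb with that position.
to_short <- month.abb[match(full, month.name)]

# Abbreviation -> full name: the same lookup in the other direction.
to_full <- month.name[match(short, month.abb)]

print(to_short)
print(to_full)
```

Any value that is not a valid month name yields NA, which is a quick way to spot typos in the key column before joining.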

df2$MonthN <- month.abb[match(df2$Month, month.name)]
library(dplyr)
left_join(df1, df2[, c("MonthN", "Building", "Population")], 
             by = c('Month' = 'MonthN', 'Building'))

Or with merge:

merge(df1, df2[, c("MonthN", "Building", "Population")], 
   by.x = c('Month', 'Building'), by.y = c('MonthN', 'Building'), all.x = TRUE)

Note: With the example data, the Population column in the merged dataset will be NA, because the "Building" values in the two sample datasets do not overlap.
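To see the join produce non-NA values, here is a minimal sketch on toy data in which some Building values do overlap. The data frames `sales` and `pop` and their contents are hypothetical stand-ins for df1 and df2. The last step also shows a two-key lookup with `match` on a pasted key. The original attempt could not work as written, because `match` takes a single `x` and a single `table`, and its third and fourth arguments are `nomatch` and `incomparables`, not additional keys.

```r
# Hypothetical toy data; Building values overlap so the join can find matches.
sales <- data.frame(
  Month      = factor(c("Jan", "Feb", "Jan"), levels = month.abb),
  Building   = c("Alexandria", "Alexandria", "1100 2ND"),
  Units_sold = c(10, 12, 7),
  stringsAsFactors = FALSE
)
pop <- data.frame(
  Month      = c("January", "February"),
  Building   = c("Alexandria", "Alexandria"),
  Population = c(1177, 1179),
  stringsAsFactors = FALSE
)

# Convert full month names to the abbreviations used in `sales`.
pop$MonthN <- month.abb[match(pop$Month, month.name)]

# Left join on both keys; rows of `sales` without a match get NA.
merged <- merge(sales, pop[, c("MonthN", "Building", "Population")],
                by.x = c("Month", "Building"),
                by.y = c("MonthN", "Building"),
                all.x = TRUE)
print(merged)

# Alternative: a two-key lookup with match() on a combined key.
key_sales <- paste(sales$Month, sales$Building)
key_pop   <- paste(pop$MonthN, pop$Building)
lookup    <- pop$Population[match(key_sales, key_pop)]
print(lookup)
```

The `match` variant keeps the row order of `sales`, whereas `merge` sorts the result by the key columns by default.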

Answer 1: accepted 2020-01-31 20:36:04
