Adding values from one dataframe to another based on two matching conditions in R
I have the two dataframes below:
dput output df1:
structure(list(Location = c("1100 2ND AVENUE", "1100 2ND AVENUE",
"1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE"
), `Ivend Name` = c("3 Mskt 1.92oz", "Almond Joy 1.61oz", "Aquafina 20oz",
"BCanyonChptleAdzuk1.5oz", "BlkForest FrtSnk 2.25oz", "BluDimndSmkhseAlmd1.5oz"
), `Category Name` = c("Candy", "Candy", "Water", "Salty Snacks",
"Candy", "Nuts/Trailmix"), Calories = c(240, 220, 0, 215, 193,
260), Sugars = c("36", "20", "0", "2", "32", "2"), Month = structure(c(4L,
4L, 4L, 4L, 4L, 4L), .Label = c("Oct", "Nov", "Dec", "Jan", "Feb",
"Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"), class = "factor"),
Products_available_per_machine = c(0, 0, 0, 0, 0, 0), Units_sold = c(0,
0, 0, 0, 0, 0), Total_Sales = c(0, 0, 0, 0, 0, 0), Spoils = c(0,
0, 0, 0, 0, 0), Building = c("1100 2ND", "1100 2ND", "1100 2ND",
"1100 2ND", "1100 2ND", "1100 2ND"), Item = structure(c(2L,
2L, 1L, 2L, 2L, 2L), .Label = c("Beverage", "Food"), class = "factor"),
Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2019", class = "factor")), row.names = c(NA,
-6L), class = c("data.table", "data.frame"))
dput output df2:
structure(list(`Date Ran` = structure(c(1548892800, 1551312000,
1553817600, 1556582400, 1561680000, 1564531200), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Year = c(2019, 2019, 2019, 2019, 2019,
2019), Month = c("January", "February", "March", "April", "June",
"July"), Location = c("SEA18", "SEA18", "SEA18", "SEA18", "SEA18",
"SEA18"), Building = c("Alexandria", "Alexandria", "Alexandria",
"Alexandria", "Alexandria", "Alexandria"), Population = c(1177,
1179, 1178, 1156, 1163, 1163)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
I want to pull the Population column from df2 and add it to df1, matching on 'Building' and 'Month', in the order in which Population is filled in df2.
I tried these commands, one using match and one using merge, but the column is NULL when I execute them:
df_2019_final1$Population <- df_2019_pop$Population[match(df_2019_final1$Month, df_2019_pop$Month, df_2019_final1$Building, df_2019_pop$Building)]
subset_df_pop <- df_2019_pop[, c("Month", "Building", "Population")]
updated_2019_test <- merge(df_2019_final1, subset_df_pop, by = c('Month', 'Building'))
Both produce NULLs and a blank dataframe.
Any help would be greatly appreciated.
In one of the datasets, 'Month' is abbreviated, and in the other it is the full name. We can convert to either one of those formats, and then merge would work:
df2$MonthN <- month.abb[match(df2$Month, month.name)]
library(dplyr)
left_join(df1, df2[, c("MonthN", "Building", "Population")],
by = c('Month' = 'MonthN', 'Building'))
Or with merge:
merge(df1, df2[, c("MonthN", "Building", "Population")],
by.x = c('Month', 'Building'), by.y = c('MonthN', 'Building'), all.x = TRUE)
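As a minimal sketch of the same idea on data where some keys do line up, the following uses two small made-up data frames. The names sales, pop and out, and all the values, are hypothetical and are not the asker's real data. Only base R is used, so it should run without extra packages.

# Hypothetical toy data: building names are chosen so that some rows match
sales <- data.frame(
  Month    = factor(c("Jan", "Jan", "Feb"), levels = month.abb),
  Building = c("Alexandria", "1100 2ND", "Alexandria"),
  Units    = c(5, 3, 7),
  stringsAsFactors = FALSE
)
pop <- data.frame(
  Month      = c("January", "February"),
  Building   = c("Alexandria", "Alexandria"),
  Population = c(1177, 1179),
  stringsAsFactors = FALSE
)

# Convert full month names to the three-letter abbreviations used in sales
pop$MonthN <- month.abb[match(pop$Month, month.name)]

# Left join on both keys; rows of sales without a partner keep NA in Population
out <- merge(sales, pop[, c("MonthN", "Building", "Population")],
             by.x = c("Month", "Building"), by.y = c("MonthN", "Building"),
             all.x = TRUE)
print(out)

In this toy case the two "Alexandria" rows pick up a Population value, while the "1100 2ND" row has no partner in pop and is left as NA.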
Note: With the sample data posted in the question, the Population column in the merged dataset will be NA, because the "Building" values in the two subsets do not overlap.
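One way to check for this kind of key mismatch before merging is sketched below. The data frames keys1 and keys2 are hypothetical stand-ins for the key columns of the two tables, with made-up values, and the "|" separator assumes that character does not occur in the keys themselves.

# Hypothetical stand-ins for the key columns of the two tables
keys1 <- data.frame(Month = c("Jan", "Jan"),
                    Building = c("1100 2ND", "Alexandria"),
                    stringsAsFactors = FALSE)
keys2 <- data.frame(MonthN = c("Jan", "Feb"),
                    Building = c("Alexandria", "Alexandria"),
                    stringsAsFactors = FALSE)

# Building values present in the first table but absent from the second
missing_buildings <- setdiff(unique(keys1$Building), unique(keys2$Building))
print(missing_buildings)

# Row-level check: does each (Month, Building) pair have a partner?
has_match <- paste(keys1$Month, keys1$Building, sep = "|") %in%
  paste(keys2$MonthN, keys2$Building, sep = "|")

# Rows of the first table that would end up with NA after a left join
print(keys1[!has_match, ])

If missing_buildings is non-empty, or many rows have has_match equal to FALSE, the join keys probably need cleaning (for example, consistent building names) before the merge can fill in Population.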