簡體   English   中英

根據 R 中的兩個匹配條件將一個數據幀中的值添加到另一個數據幀

[英]Adding values from one dataframe to another based on two matching conditions in R

我在下面有兩個數據框:

dput 輸出 df1:

structure(list(Location = c("1100 2ND AVENUE", "1100 2ND AVENUE", 
"1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE", "1100 2ND AVENUE"
), `Ivend Name` = c("3 Mskt 1.92oz", "Almond Joy 1.61oz", "Aquafina 20oz", 
"BCanyonChptleAdzuk1.5oz", "BlkForest FrtSnk 2.25oz", "BluDimndSmkhseAlmd1.5oz"
), `Category Name` = c("Candy", "Candy", "Water", "Salty Snacks", 
"Candy", "Nuts/Trailmix"), Calories = c(240, 220, 0, 215, 193, 
260), Sugars = c("36", "20", "0", "2", "32", "2"), Month = structure(c(4L, 
4L, 4L, 4L, 4L, 4L), .Label = c("Oct", "Nov", "Dec", "Jan", "Feb", 
"Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"), class = "factor"), 
    Products_available_per_machine = c(0, 0, 0, 0, 0, 0), Units_sold = c(0, 
    0, 0, 0, 0, 0), Total_Sales = c(0, 0, 0, 0, 0, 0), Spoils = c(0, 
    0, 0, 0, 0, 0), Building = c("1100 2ND", "1100 2ND", "1100 2ND", 
    "1100 2ND", "1100 2ND", "1100 2ND"), Item = structure(c(2L, 
    2L, 1L, 2L, 2L, 2L), .Label = c("Beverage", "Food"), class = "factor"), 
    Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2019", class = "factor")), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x00000233561b1ef0>)

dput 輸出 df2:

structure(list(`Date Ran` = structure(c(1548892800, 1551312000, 
1553817600, 1556582400, 1561680000, 1564531200), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Year = c(2019, 2019, 2019, 2019, 2019, 
2019), Month = c("January", "February", "March", "April", "June", 
"July"), Location = c("SEA18", "SEA18", "SEA18", "SEA18", "SEA18", 
"SEA18"), Building = c("Alexandria", "Alexandria", "Alexandria", 
"Alexandria", "Alexandria", "Alexandria"), Population = c(1177, 
1179, 1178, 1156, 1163, 1163)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

我想從 DF 2 中拉出 pop col 並將其添加到基於“Building”和“Month”的 Dataframe 1,順序是填充到 DF2 中。

我使用 merge 嘗試了這個命令,但是當我執行時 col 為 NULL:

df_2019_final1$Population <- df_2019_pop$Population[match(df_2019_final1$Month, df_2019_pop$Month, df_2019_final1$Building, df_2019_pop$Building)]

subset_df_pop <- df_2019_pop[, c("Month", "Building", "Population")]


updated_2019_test <- merge(df_2019_final1, subset_df_pop, by = c('Month', 'Building'))

兩者都產生 NULLS 和空白 DF

任何幫助將不勝感激。

在其中一個數據集中,“月份”是縮寫,而在第二個數據集中,它是全名。 我們可以調整到這些格式中的任何一種並且merge會起作用

df2$MonthN <- month.abb[match(df2$Month, month.name)]
library(dplyr)
left_join(df1, df2[, c("MonthN", "Building", "Population")], 
             by = c('Month' = 'MonthN', 'Building'))

merge

merge(df1, df2[, c("MonthN", "Building", "Population")], 
   by.x = c('Month', 'Building'), by.y = c('MonthN', 'Building'), all.x = TRUE)

注意:合並數據集上的人口列將根據示例為NA ,因為子集數據集中的“建築”值不同

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM