
Conditional JOIN on two data frames in R

Suppose there are two data frames, as shown below (taken from this post):

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

df1
#  CustomerId Product
#           1 Toaster
#           2 Toaster
#           3 Toaster
#           4   Radio
#           5   Radio
#           6   Radio

df2
#  CustomerId   State
#           2 Alabama
#           4 Alabama
#           6    Ohio

The question is how to run the following SQL query in R:

SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId

What I know is that I can do an inner join with merge(df1, df2, by = "CustomerId"), but that only matches rows on equality, so it does not satisfy the join condition above.

This is a rather convoluted approach, but it works:

library(tidyverse)
df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

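# Walk over the rows of df1 (.x = CustomerId, .y = Product) and row-bind the results
map2_df(
  df1$CustomerId, df1$Product,
  .f = ~ {
    # Rows of df2 that satisfy the join condition for the current df1 row
    temp <- df2 %>% filter(.x <= CustomerId)
    # .x and .y are recycled to the number of matching df2 rows
    tibble(CustomerId.x = .x, Product = .y, 
           CustomerId.y = temp$CustomerId, State = temp$State)
  }
)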
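
For comparison, the same idea (pair every row of df1 with the rows of df2 that pass the condition) can be sketched in base R without any packages. This is only an illustration, not part of the original answer: it builds the full Cartesian product first, so it assumes both data frames are small enough for nrow(df1) * nrow(df2) rows to fit in memory. The names crossed and result are made up for the example.

```r
df1 <- data.frame(CustomerId = 1:6, Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# by = NULL makes merge() return every pairing of rows (a Cartesian product);
# the shared column name is disambiguated as CustomerId.x and CustomerId.y
crossed <- merge(df1, df2, by = NULL)

# Keep only the pairs that satisfy the join condition
result <- crossed[crossed$CustomerId.x <= crossed$CustomerId.y, ]
rownames(result) <- NULL
result
```

With the sample data this keeps 12 of the 18 possible pairs.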

As I found out from a comment by Grothendieck, a simple solution is to use the sqldf package and get the result by writing the query in SQL:

library(sqldf)
sqldf("SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId")
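
If staying within dplyr is preferred over SQL, newer versions of the package can express the same inequality condition directly. The sketch below assumes dplyr 1.1.0 or later, where join_by() is available; it is not from the original answers, and result is just an illustrative name. The x$ and y$ prefixes spell out which table each CustomerId belongs to.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides join_by()

df1 <- data.frame(CustomerId = 1:6, Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# Non-equi inner join: keep each pair of rows where df1's id is <= df2's id
result <- inner_join(df1, df2, by = join_by(x$CustomerId <= y$CustomerId))
result
```

Because the join is on an inequality, both key columns should be kept in the output, distinguished by the .x and .y suffixes.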
