Comparing two tables in R to find which products a customer is not purchasing
I have two tables, as follows:
Cust_list <- data.frame(
stringsAsFactors = FALSE,
Customer = c("Mike S.","Tim P."),
Type = c("Shoes","Socks"),
Product_ID = c(233,6546)
)
Product_Table <- data.frame(
stringsAsFactors = FALSE,
Product_ID = c(233,256,296,8536,6546,8946),
Type = c("Shoes","Shoes","Shoes", "Socks","Socks","Socks")
)
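I want to identify the Product_ID values that each Customer has not purchased within the Type they bought from.
For example, Mike S. purchased Product_ID = 233 with Type = "Shoes", but did not purchase Product_ID = 256 or 296. Because Mike S. did not purchase anything with Type = "Socks", those products are not included in the output.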
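The output table is as follows.
library(dplyr)
# Keep each customer's Type, attach every product of that Type,
# then drop the products the customer already purchased.
select(Cust_list, -Product_ID) %>%
  left_join(Product_Table, by = "Type") %>%
  anti_join(Cust_list, by = c("Customer", "Type", "Product_ID"))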
Customer Type Product_ID
1 Mike S. Shoes 256
2 Mike S. Shoes 296
3 Tim P. Socks 8536
4 Tim P. Socks 8946
Does this work:
library(dplyr)
library(tidyr)
Cust_list %>% full_join(Product_Table) %>% arrange(Type) %>%
fill(Customer,.direction = 'down') %>% anti_join(Cust_list)
Joining, by = c("Type", "Product_ID")
Joining, by = c("Customer", "Type", "Product_ID")
Customer Type Product_ID
1 Mike S. Shoes 256
2 Mike S. Shoes 296
3 Tim P. Socks 8536
4 Tim P. Socks 8946
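One caveat on the fill()-based pipeline above: it appears to rely on each Type having exactly one customer, because fill() copies whichever Customer comes first within the sorted rows. The following is a minimal sketch of an order-independent variant, not a definitive implementation. It assumes the same column names as the question; the extra purchase by Mike S. and the customer "Ann K." are hypothetical rows added only to illustrate the multi-customer case, and the `relationship` argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical data: same columns as the question, plus a second purchase
# by Mike S. and a second Shoes customer (Ann K.) added for illustration.
Cust_list <- data.frame(
  stringsAsFactors = FALSE,
  Customer = c("Mike S.", "Mike S.", "Tim P.", "Ann K."),
  Type = c("Shoes", "Shoes", "Socks", "Shoes"),
  Product_ID = c(233, 256, 6546, 296)
)
Product_Table <- data.frame(
  stringsAsFactors = FALSE,
  Product_ID = c(233, 256, 296, 8536, 6546, 8946),
  Type = c("Shoes", "Shoes", "Shoes", "Socks", "Socks", "Socks")
)

not_purchased <- Cust_list %>%
  # One row per customer/type pair, so repeat purchases do not duplicate rows
  distinct(Customer, Type) %>%
  # Attach every product of each type the customer has bought from;
  # several customers can share a type, hence many-to-many (dplyr >= 1.1.0)
  inner_join(Product_Table, by = "Type", relationship = "many-to-many") %>%
  # Drop the products the customer already purchased
  anti_join(Cust_list, by = c("Customer", "Type", "Product_ID")) %>%
  arrange(Customer, Type, Product_ID)

print(not_purchased)
```

With this sample data, the result lists 233 and 256 for Ann K., 296 for Mike S., and 8536 and 8946 for Tim P. Naming the join keys explicitly with `by` also avoids the "Joining, by = ..." messages.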