R lookup between 2 data frames

Question

Suppose I have two data frames:

df1=data.frame(item=c(rep("a",2),rep("b",3),"c","NA",rep("d",4)),
product=paste0("prd",seq(1:11)))
df2=data.frame(item=c("b","d"), price=c(10,20))

For df1, I need to add a column indicating whether each item appears in df2's item column. I also need each row to show how many products its item has, unless the item is NA. Like this:

item product#
a    2
a    2
b    3
b    3
b    3

How can I get the product count repeated on each row?

For the lookup I'm using:

df1$hasDF2=ifelse(is.na(match(df1$item,df2$item)),"N","Y")

Is there a more efficient alternative?

Thanks!

Answer 1

Try:

 # ave() applies length() within each item group and repeats the result on every row
 df1$productNo <- with(df1, ave(seq_along(product), item, FUN=length))
 df1$productNo
 #[1] 2 2 3 3 3 1 1 4 4 4 4

 df1$hasDF2 <- c("N", "Y")[(!is.na(match(df1$item, df2$item))) +1]
 df1$hasDF2
 #[1] "N" "N" "Y" "Y" "Y" "N" "N" "Y" "Y" "Y" "Y"
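
The question also asks that the count be skipped for the missing item, which the snippet above does not do. Here is a minimal base R sketch of that. It assumes the literal string "NA" in `df1$item` is meant as a missing value (the original data stores it as text), and it uses `%in%` as shorthand for the `match` test:

```r
# Rebuild the example data from the question
df1 <- data.frame(item = c(rep("a", 2), rep("b", 3), "c", "NA", rep("d", 4)),
                  product = paste0("prd", 1:11),
                  stringsAsFactors = FALSE)
df2 <- data.frame(item = c("b", "d"), price = c(10, 20))

# Assumption: the text "NA" stands for a missing item
df1$item[df1$item == "NA"] <- NA

# %in% returns TRUE/FALSE directly, so is.na(match(...)) is not needed
df1$hasDF2 <- ifelse(df1$item %in% df2$item, "Y", "N")

# Count rows per item; leave the count missing where the item is missing
ok <- !is.na(df1$item)
df1$productNo <- NA_integer_
df1$productNo[ok] <- ave(seq_along(df1$item[ok]), df1$item[ok], FUN = length)

df1$productNo
# 2 2 3 3 3 1 NA 4 4 4 4
```

A missing item never matches anything in `df2$item`, so its `hasDF2` value comes out as "N".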

Or using data.table:

 library(data.table)
 # .N is the number of rows in each item group; hasDF2 starts as "N" everywhere
 setDT(df1)[,c("productNo", "hasDF2") := list(.N, "N"), 
              by=item][item %in% df2$item, hasDF2:= "Y"]

Update

For the unique count, you could do:

 #creating a dataset with duplicate products
 df1 <- data.frame(item=c(rep("a",2),rep("b",3),"c","NA",rep("d",5)),
 product=paste0("prd",c(1:11,11)))

 setDT(df1)[,c("productNo", "hasDF2") := list(length(unique(product)), "N"), 
          by=item][item %in% df2$item, hasDF2:= "Y"]
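
If data.table is not available, the same distinct-product count can be sketched in base R with `ave`. This is an illustrative alternative rather than part of the original answer, and the helper vector `codes` is a name introduced here for the example:

```r
# Dataset with a duplicated product ("prd11" appears twice under item "d")
df1 <- data.frame(item = c(rep("a", 2), rep("b", 3), "c", "NA", rep("d", 5)),
                  product = paste0("prd", c(1:11, 11)),
                  stringsAsFactors = FALSE)

# ave() writes its results back into its first argument, so pass integer
# codes for the products rather than the character column itself
codes <- as.integer(factor(df1$product))

# Count the distinct product codes within each item
df1$productNo <- ave(codes, df1$item, FUN = function(x) length(unique(x)))

df1$productNo
# 2 2 3 3 3 1 1 4 4 4 4 4
```

Item "d" has five rows but only four distinct products, so every "d" row gets 4.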

Answer 1: accepted, 2014-09-07 09:43:20
