How to create a variable based on the number of unique values in another data frame?

Question

This is a simplified example of what I want to do.

Dataset 1 (DF1) has data on apples (such as their size or number of holes), and a second dataset (DF2) has information on the worms found inside them, including each worm's color and the apple in which it was found. I want to add a variable to DF1 holding the number of unique worm colors found in each apple.

DF1<-data.frame(x=c("A1","A2","A3","A4","A5"),y=c(3,26,5,27,5))
DF2<-data.frame(Q=c("A1","A1","A1","A1","A1","A1","A2","A2","A3","A3","A3","A4","A5","A5","A5","A5"),R=c("red","red","blue","yellow","yellow","blue","orange","orange","green","red","red","blue","blue", "purple","black","red"),S=c(4,5,3,5,4,3,5,4,3,5,4,3,5,4,3,5))

I am new to R, and my first attempt at solving this was:

DF1$N.Colors<-length(unique(DF2$R[match(DF1$X,DF2$Q)]))

But this gives me a new variable filled with 0s instead of the desired vector:

 DF1$N.Colors<-c(3,1,2,1,4)

I would very much appreciate your help with this.

Answer 1

This can be done by joining the two datasets on the 'Q' and 'x' columns, counting the unique values of 'R' for each apple, and assigning the result to a new column in 'DF1':

library(data.table)
DF1$N.Colors <- setDT(DF2)[DF1, uniqueN(R), on = .(Q = x), by = .EACHI]$V1

Or using the tidyverse:

library(dplyr)
DF2 %>%
   group_by(x = Q) %>%
   summarise(N.Colors = n_distinct(R)) %>%
   right_join(DF1, by = "x")   # name the key explicitly instead of relying on the automatic match
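
As a variation on the dplyr approach, the join can start from DF1 so that its columns stay first and apples with no worms get a count of 0 rather than NA. This is only a sketch under those assumptions; the names `color_counts` and `result` are introduced here for illustration and are not part of the original answer.

```r
library(dplyr)

DF1 <- data.frame(x = c("A1","A2","A3","A4","A5"), y = c(3,26,5,27,5),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(Q = c("A1","A1","A1","A1","A1","A1","A2","A2","A3","A3","A3",
                        "A4","A5","A5","A5","A5"),
                  R = c("red","red","blue","yellow","yellow","blue","orange",
                        "orange","green","red","red","blue","blue","purple",
                        "black","red"),
                  stringsAsFactors = FALSE)

# One row per apple with the number of distinct worm colors
color_counts <- DF2 %>%
  group_by(Q) %>%
  summarise(N.Colors = n_distinct(R))

# Keep every row of DF1; apples missing from DF2 get 0 instead of NA
result <- DF1 %>%
  left_join(color_counts, by = c("x" = "Q")) %>%
  mutate(N.Colors = coalesce(N.Colors, 0L))
```

With the example data every apple appears in DF2, so the `coalesce()` step changes nothing here; it only matters if DF1 contains apples that have no rows in DF2.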

Answer 2

A base R solution with aggregate() and merge():

merge(
  DF1,
  # count the distinct colors (R) within each apple (Q)
  aggregate(N.Colors ~ Q, list(N.Colors = DF2$R, Q = DF2$Q),
            function(x) length(unique(x))),
  by.x = "x", by.y = "Q", all.x = TRUE
)

#    x  y N.Colors
# 1 A1  3        3
# 2 A2 26        1
# 3 A3  5        2
# 4 A4 27        1
# 5 A5  5        4
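
For completeness, here is a sketch of why the attempt in the question returns 0s, followed by a per-apple version of the same idea. In the question's code, `DF1$X` is `NULL` because column names are case sensitive (the column is `x`), so `match()` returns an empty vector and `length(unique(...))` is 0, which is then recycled into every row. Even with `x` spelled correctly, `match()` returns only the first matching row per apple, and `length(unique(...))` yields a single number for the whole data frame rather than one per apple. The helper name `count_colors` below is hypothetical.

```r
DF1 <- data.frame(x = c("A1","A2","A3","A4","A5"), y = c(3,26,5,27,5),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(Q = c("A1","A1","A1","A1","A1","A1","A2","A2","A3","A3","A3",
                        "A4","A5","A5","A5","A5"),
                  R = c("red","red","blue","yellow","yellow","blue","orange",
                        "orange","green","red","red","blue","blue","purple",
                        "black","red"),
                  stringsAsFactors = FALSE)

# The original attempt: DF1$X does not exist, so the result is a single 0
original_attempt <- length(unique(DF2$R[match(DF1$X, DF2$Q)]))

# Hypothetical helper: number of distinct colors among the worms of one apple
count_colors <- function(apple) length(unique(DF2$R[DF2$Q == apple]))

# Apply the helper to each apple in DF1; apples absent from DF2 yield 0
DF1$N.Colors <- vapply(DF1$x, count_colors, integer(1), USE.NAMES = FALSE)
```

This scans DF2 once per apple, so for large data the join or aggregate approaches above are likely to scale better.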
