Dataframe1 has two columns: num_movies and userId. Dataframe2 has two columns: No_movies and userId. But Dataframe2 has 2106 rows and Dataframe1 has 1679 rows. I want to subtract the number of movies in Dataframe2 from Dataframe1 based on matching userId values. I have written the following line:
df1$num_movies = df1$num_movies - df2$No_movies[df1$userId %in% df2$userId]
and I get the following error:
Error in `$<-.data.frame`(`*tmp*`, "num_movies", value = c(2, 9, 743, :
replacement has 2106 rows, data has 1679
In addition: Warning message:
In df1$num_movies - df2$No_movies[df1$userId %in% :
longer object length is not a multiple of shorter object length
Elsewhere it has been proposed that I upgrade from 3.0.2 to 3.1.2 to solve this problem, but I still get the same error after the upgrade. What I have written seems logical to me: I intend to pick only 1679 userIds out of 2106. Why is it selecting all of them? How do I circumvent this error?
You can use the match function to find the corresponding row from Dataframe2 for each row in Dataframe1. Users with no match come back as NA, which the second line below replaces with 0:
matched.movies <- Dataframe2$No_movies[match(Dataframe1$userId, Dataframe2$userId)]
matched.movies[is.na(matched.movies)] <- 0
Dataframe1$num_movies <- Dataframe1$num_movies - matched.movies
Dataframe1
#   num_movies userId
# 1         10      1
# 2          7      2
# 3          6      3
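As for why the original line selects every row: `df1$userId %in% df2$userId` is a logical vector with one entry per row of df1, but it is used to index a column of df2. R recycles a logical index that is shorter than the vector being indexed, so an all-TRUE index returns every element of df2$No_movies. The 2106-row replacement in the error suggests that is what happened here, though that is an inference from the message rather than something checked against the real data. A minimal sketch with made-up toy data (small1, small2, idx and picked are hypothetical names, not from the question):

```r
# Hypothetical toy data: every userId in small1 also appears in small2.
small1 <- data.frame(num_movies = c(10, 10, 10), userId = c(2, 3, 9))
small2 <- data.frame(No_movies = 2:6, userId = c(0, 2, 3, 9, 10))

# One TRUE/FALSE per row of small1 (length 3), not per row of small2.
idx <- small1$userId %in% small2$userId

# Indexing a length-5 vector with a length-3 logical vector recycles the
# index, so an all-TRUE index returns all five values. The values picked
# are not aligned with the rows of small1 in any way.
picked <- small2$No_movies[idx]

length(idx)     # 3
length(picked)  # 5
```

Because the index only says whether a userId exists in the other frame, not where, it cannot line the two columns up. match returns positions, which is why it works.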
Data:
(Dataframe1 <- data.frame(num_movies=rep(10, 3), userId=1:3))
#   num_movies userId
# 1         10      1
# 2         10      2
# 3         10      3
(Dataframe2 <- data.frame(No_movies=2:6, userId=c(0, 2, 3, 9, 10)))
#   No_movies userId
# 1         2      0
# 2         3      2
# 3         4      3
# 4         5      9
# 5         6     10
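An alternative sketch, not part of the answer above, uses base R's merge with all.x = TRUE as a left join: every row of Dataframe1 is kept, and users missing from Dataframe2 get NA, which is then treated as 0. It assumes userId is unique within Dataframe2 (duplicates would produce extra rows), and the names merged and result are illustrative only. merge sorts by the key column by default, so the row order may differ from the original Dataframe1.

```r
# Same example data as above, repeated so the block runs on its own.
Dataframe1 <- data.frame(num_movies = rep(10, 3), userId = 1:3)
Dataframe2 <- data.frame(No_movies = 2:6, userId = c(0, 2, 3, 9, 10))

# Left join: keep all rows of Dataframe1, NA where no userId matches.
merged <- merge(Dataframe1, Dataframe2, by = "userId", all.x = TRUE)

# Treat users missing from Dataframe2 as having 0 movies to subtract.
merged$No_movies[is.na(merged$No_movies)] <- 0

# Rebuild a frame with the original two columns.
result <- data.frame(
  num_movies = merged$num_movies - merged$No_movies,
  userId     = merged$userId
)
result
```

For this data the match approach and the merge approach should give the same values. match keeps the original row order and avoids building an intermediate joined frame, which may matter for larger data.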