I have two data frames, df.1 and df.2, and I'd like to remove rows from df.2 based on whether certain things about df.1 are true. Specifically, I want to delete every row from df.2 whose corresponding row in df.1 (same date and dog) has an NA value for feistiness. How does one go about doing this? (I've looked at other questions and still couldn't figure this out.)
Reproducible code for the first data frame:
# create first data frame
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- c(rep("Fido", times = 10), rep("Snoopy", times = 10), rep("Speckles", times = 10), rep("Pit", times = 10))
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(dates, dogs, feistiness)
names(df.1) <- c("date", "dog", "feistiness")
Which yields:
date dog feistiness
1 1983-09-11 Fido 56
2 1983-09-12 Fido 18
3 1983-09-13 Fido 97
4 1983-09-14 Fido 49
5 1983-09-15 Fido 49
6 1983-09-16 Fido 59
7 1983-09-17 Fido 72
8 1983-09-18 Fido 69
9 1983-09-19 Fido 18
10 1983-09-20 Fido 95
11 1983-09-11 Snoopy 69
12 1983-09-12 Snoopy 16
13 1983-09-13 Snoopy 58
14 1983-09-14 Snoopy 65
15 1983-09-15 Snoopy 83
16 1983-09-16 Snoopy 7
17 1983-09-17 Snoopy 12
18 1983-09-18 Snoopy 89
19 1983-09-19 Snoopy 56
20 1983-09-20 Snoopy 52
21 1983-09-11 Speckles 13
22 1983-09-12 Speckles 15
23 1983-09-13 Speckles 16
24 1983-09-14 Speckles 56
25 1983-09-15 Speckles 67
26 1983-09-16 Speckles 15
27 1983-09-17 Speckles 57
28 1983-09-18 Speckles 76
29 1983-09-19 Speckles 57
30 1983-09-20 Speckles 78
31 1983-09-11 Pit 68
32 1983-09-12 Pit 22
33 1983-09-13 Pit 28
34 1983-09-14 Pit 9
35 1983-09-15 Pit 59
36 1983-09-16 Pit NA
37 1983-09-17 Pit NA
38 1983-09-18 Pit NA
39 1983-09-19 Pit NA
40 1983-09-20 Pit NA
And the second data frame:
# create second data frame
dates.2 <- as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01")
dogs.2 <- c("Fido", "Snoopy", "Speckles", "Pit")
df.2 <- data.frame(dates.2, dogs.2)
names(df.2) <- c("date", "dog")
Which yields:
date dog
1 1983-09-12 Fido
2 1983-09-15 Snoopy
3 1983-09-14 Speckles
4 1983-09-19 Pit
The final output data frame should look like the following, with the last row removed because the feistiness value for Pit at 1983-09-19 is NA:
date dog
1 1983-09-12 Fido
2 1983-09-15 Snoopy
3 1983-09-14 Speckles
We can use anti_join from dplyr. df_final is the final output.
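library(dplyr)

# Keep only the df.2 rows that have no (date, dog) match
# among the df.1 rows where feistiness is NA
df_final <- df.2 %>%
  anti_join(df.1 %>% filter(is.na(feistiness)), by = c("date", "dog"))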
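If loading dplyr is not an option, the same idea can be sketched in base R. This is only one possible approach, not the answer's method: it builds a combined date/dog key for each row and drops the df.2 rows whose key appears among the NA rows of df.1. The helper name make_key and the "|" separator are illustrative assumptions, not part of the original code.

```r
# Rebuild the example data from the question
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- c(rep("Fido", times = 10), rep("Snoopy", times = 10),
          rep("Speckles", times = 10), rep("Pit", times = 10))
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(date = dates, dog = dogs, feistiness = feistiness)

df.2 <- data.frame(
  date = as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01"),
  dog = c("Fido", "Snoopy", "Speckles", "Pit")
)

# Hypothetical helper: one string key per row, combining date and dog
make_key <- function(d) paste(d$date, d$dog, sep = "|")

# (date, dog) pairs in df.1 whose feistiness is NA
na_rows <- df.1[is.na(df.1$feistiness), ]

# Keep the df.2 rows whose key is not among the NA pairs
df_final <- df.2[!(make_key(df.2) %in% make_key(na_rows)), ]
print(df_final)
```

Because the key is built with paste(), this works whether dog is stored as a character vector or as a factor. For larger data, the anti_join version is likely the cleaner choice.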