
R: Removing rows from data frame based on external criteria

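I have two data frames, df.1 and df.2, and I'd like to delete rows from df.2 based on whether certain things are true in df.1. Specifically, I want to delete every row of df.2 for which df.1's value of feistiness on the corresponding date is NA. How can I do this? (I have looked at other questions but still can't work it out.)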

Reproducible code for the first data frame:

# create first data frame
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- c(rep("Fido", times = 10), rep("Snoopy", times = 10), rep("Speckles", times = 10), rep("Pit", times = 10))
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(dates, dogs, feistiness)
names(df.1) <- c("date", "dog", "feistiness")

Which produces:

         date      dog feistiness
1  1983-09-11     Fido         56
2  1983-09-12     Fido         18
3  1983-09-13     Fido         97
4  1983-09-14     Fido         49
5  1983-09-15     Fido         49
6  1983-09-16     Fido         59
7  1983-09-17     Fido         72
8  1983-09-18     Fido         69
9  1983-09-19     Fido         18
10 1983-09-20     Fido         95
11 1983-09-11   Snoopy         69
12 1983-09-12   Snoopy         16
13 1983-09-13   Snoopy         58
14 1983-09-14   Snoopy         65
15 1983-09-15   Snoopy         83
16 1983-09-16   Snoopy          7
17 1983-09-17   Snoopy         12
18 1983-09-18   Snoopy         89
19 1983-09-19   Snoopy         56
20 1983-09-20   Snoopy         52
21 1983-09-11 Speckles         13
22 1983-09-12 Speckles         15
23 1983-09-13 Speckles         16
24 1983-09-14 Speckles         56
25 1983-09-15 Speckles         67
26 1983-09-16 Speckles         15
27 1983-09-17 Speckles         57
28 1983-09-18 Speckles         76
29 1983-09-19 Speckles         57
30 1983-09-20 Speckles         78
31 1983-09-11      Pit         68
32 1983-09-12      Pit         22
33 1983-09-13      Pit         28
34 1983-09-14      Pit          9
35 1983-09-15      Pit         59
36 1983-09-16      Pit         NA
37 1983-09-17      Pit         NA
38 1983-09-18      Pit         NA
39 1983-09-19      Pit         NA
40 1983-09-20      Pit         NA

The second data frame:

# create second data frame
dates.2 <- as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01")
dogs.2 <- c("Fido", "Snoopy", "Speckles", "Pit")
df.2 <- data.frame(dates.2, dogs.2)
names(df.2) <- c("date", "dog")

Which produces:

        date      dog
1 1983-09-12     Fido
2 1983-09-15   Snoopy
3 1983-09-14 Speckles
4 1983-09-19      Pit

The final output data frame should look like this, with the last row removed because Pit's feistiness value on that date is NA:

        date      dog
1 1983-09-12     Fido
2 1983-09-15   Snoopy
3 1983-09-14 Speckles

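We can use anti_join from dplyr: first keep only the rows of df.1 where feistiness is NA, then drop every row of df.2 that matches one of them on both date and dog. df_final is the final output.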

library(dplyr)

df_final <- df.2 %>%
  anti_join(df.1 %>% filter(is.na(feistiness)), by = c("date", "dog"))
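
If you would rather not load dplyr, the same idea can be sketched in base R. This is only a minimal illustration, not part of the original answer: the names na_rows, key_na, key_2 and df_final_base are made up here, and the composite key assumes that the "|" separator never appears in a dog name.

```r
# Rebuild the two data frames from the question so the sketch is self-contained
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- rep(c("Fido", "Snoopy", "Speckles", "Pit"), each = 10)
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(date = dates, dog = dogs, feistiness = feistiness)

df.2 <- data.frame(
  date = as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01"),
  dog  = c("Fido", "Snoopy", "Speckles", "Pit")
)

# Rows of df.1 whose feistiness is missing
na_rows <- df.1[is.na(df.1$feistiness), ]

# Composite date/dog key for each row (hypothetical helper vectors;
# assumes "|" does not occur in any dog name)
key_na <- paste(na_rows$date, na_rows$dog, sep = "|")
key_2  <- paste(df.2$date, df.2$dog, sep = "|")

# Keep only the rows of df.2 that have no NA match in df.1
df_final_base <- df.2[!(key_2 %in% key_na), ]
df_final_base
```

Matching on a pasted key keeps the check on both date and dog, mirroring by = c("date", "dog") in the anti_join call.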

