Only keep the minimum value of each group [duplicate]
I have the following data.table:
> dataz <- data.table(group = c("ZAS", "Car", rep("EEE", times = 3), rep("EEff", times = 2), rep("2133", times = 6), "EETTE"),
value = runif(14))
> dataz
group value
1: ZAS 0.27218511
2: Car 0.39520602
3: EEE 0.46775956
4: EEE 0.55071786
5: EEE 0.37529203
6: EEff 0.01471177
7: EEff 0.86282569
8: 2133 0.20789336
9: 2133 0.91272858
10: 2133 0.06315207
11: 2133 0.18178237
12: 2133 0.42354538
13: 2133 0.10176267
14: EETTE 0.88492458
Within each group, I want to keep only the row that has the minimum value.
The final data.table would look like this:
group value
1: ZAS 0.27218511
2: Car 0.39520602
3: EEE 0.37529203
4: EEff 0.01471177
5: 2133 0.06315207
6: EETTE 0.88492458
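Using .SD (the values below differ from those in the question because runif() draws new random numbers on each run):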
dataz[,.SD[value==min(value)],by=.(group)]
group value
<char> <num>
1: ZAS 0.39590814
2: Car 0.42591138
3: EEE 0.07049145
4: EEff 0.34670793
5: 2133 0.05702904
6: EETTE 0.31071582
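One detail worth spelling out is how ties are handled: value == min(value) keeps every row tied for the group minimum, while which.min() returns only the first one. Below is a minimal sketch of the difference. The toy table dt and the result names are made up for illustration and are not from the original post.

```r
library(data.table)

# Hypothetical toy data with a tie for the minimum in group "a"
dt <- data.table(group = c("a", "a", "a", "b"), value = c(1, 1, 2, 5))

# value == min(value) keeps every row tied for the group minimum
ties_kept <- dt[, .SD[value == min(value)], by = group]

# which.min() returns the index of the first minimum only,
# so exactly one row per group survives
first_only <- dt[, .SD[which.min(value)], by = group]

# Same rows selected through global row indices (.I);
# the inner call returns one row number per group in column V1
first_only_I <- dt[dt[, .I[which.min(value)], by = group]$V1]
```

The .I form is often suggested for large tables because it avoids materialising .SD for every group, though whether that matters depends on the data size.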
Another option is slice() from dplyr.
Sample code:
library(data.table)
library(dplyr)
dataz %>%
group_by(group) %>%
slice(which.min(value))
Result:
group value
<chr> <dbl>
1 2133 0.00592
2 Car 0.418
3 EEE 0.208
4 EEff 0.719
5 EETTE 0.963
6 ZAS 0.769
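As a variation on the slice() approach, dplyr 1.0.0 and later also provide slice_min(), which states the intent more directly. This is a sketch assuming a recent dplyr. The tibble toy is made up for illustration and is not the poster's data.

```r
library(dplyr)

# Hypothetical toy data, not taken from the original post
toy <- tibble(group = c("a", "a", "b"), value = c(3, 1, 2))

res <- toy %>%
  group_by(group) %>%
  # with_ties = FALSE guarantees a single row per group even when
  # several rows share the minimum
  slice_min(value, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Leaving with_ties at its default of TRUE would instead keep every row tied for the minimum, which matches the behaviour of filtering on value == min(value).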
Sample data:
dataz<-structure(list(group = c("ZAS", "Car", "EEE", "EEE", "EEE", "EEff",
"EEff", "2133", "2133", "2133", "2133", "2133", "2133", "EETTE"
), value = c(0.711316933622584, 0.456328510772437, 0.838366007432342,
0.556059248745441, 0.621371693909168, 0.0612441042903811, 0.391384622780606,
0.986219455022365, 0.771872294368222, 0.54334409092553, 0.122617350192741,
0.195616364479065, 0.705191325163469, 0.940613608341664)), row.names = c(NA,
-14L), class = c("data.table", "data.frame"))
group value
1: ZAS 0.7113169
2: Car 0.4563285
3: EEE 0.8383660
4: EEE 0.5560592
5: EEE 0.6213717
6: EEff 0.0612441
7: EEff 0.3913846
8: 2133 0.9862195
9: 2133 0.7718723
10: 2133 0.5433441
11: 2133 0.1226174
12: 2133 0.1956164
13: 2133 0.7051913
14: EETTE 0.9406136