Select row in a group based on priority of column values in data.table

I have a data.table as follows:

temp_dt = structure(list(group = c("A", "A", "B", "C", "D", 
"D", "E", "E"), value = c(28.395, 26.206, 64.032, 
7.588961, 0.053089, 0.053089, 0.795798, 0.795798), type = c("R", 
"P", "R", "R", "R", "P", "R", "P")), row.names = c(NA, -8L), class = c("data.table", 
"data.frame"))

> temp_dt
   group     value type
1:     A 28.395000    R
2:     A 26.206000    P
3:     B 64.032000    R
4:     C  7.588961    R
5:     D  0.053089    R
6:     D  0.053089    P
7:     E  0.795798    R
8:     E  0.795798    P

I want to subset the data.table temp_dt such that when a group has both types R and P, the row with type R is selected. If the group has only R or only P, then whichever row is available is selected.

A possible solution:

library(data.table)
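# Sort type in descending order so that R comes before P,
# then keep the first row of each group
temp_dt[order(-type), .SD[1, ], by = group]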

    group     value   type
   <char>     <num> <char>
1:      A 28.395000      R
2:      B 64.032000      R
3:      C  7.588961      R
4:      D  0.053089      R
5:      E  0.795798      R
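
The ordering above works because "R" happens to sort after "P". A minimal sketch of the same idea with an explicit priority vector, which may be easier to extend to more types; the name type_priority is an assumption introduced here for illustration and is not part of the original answer:

```r
library(data.table)

# Same data as in the question
temp_dt <- data.table(
  group = c("A", "A", "B", "C", "D", "D", "E", "E"),
  value = c(28.395, 26.206, 64.032, 7.588961,
            0.053089, 0.053089, 0.795798, 0.795798),
  type  = c("R", "P", "R", "R", "R", "P", "R", "P")
)

# Hypothetical priority vector: earlier entries win
type_priority <- c("R", "P")

# match() turns each type into its rank in type_priority;
# which.min() picks the best-ranked row within each group
result <- temp_dt[, .SD[which.min(match(type, type_priority))], by = group]
result
```

With this data every group contains an R row, so each group should return its R row; a group containing only P would return its P row.
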

Another possible solution, which reports the selected type for each group:

library(data.table)
library(dplyr)

# Count the rows for every group/type combination (the value column is dropped)
temp_11 <- table(temp_dt[ ,-2])
temp_11 <- as.data.table(temp_11)

# Select "R" when the group has at least one R row, otherwise fall back to "P"
temp_12 <- temp_11 %>%
  group_by(group) %>%
  summarise(Select = if (any(type == "R" & N > 0)) "R" else "P")

temp_12

It will give this:

  group Select
  <chr> <chr>
1 A     R
2 B     R
3 C     R
4 D     R
5 E     R
