Subset dataframe based on hierarchical preference of factor levels within column in R
I have a dataframe which I would like to subset based on a hierarchical preference of factor levels within a column. In the following example, I want to select only one "method" per level of "ID". Specifically, keep "CACL" if possible; if "CACL" doesn't exist for that level, take "KCL"; and if that doesn't exist either, take "H2O". Here df1 is the input and df2 is the result I am after.
ID<-c(1,1,1,2,2,3)
method<-c("CACL","KCL","H2O","H2O","KCL","H2O")
df1<-data.frame(ID,method)
ID method
1 1 CACL
2 1 KCL
3 1 H2O
4 2 H2O
5 2 KCL
6 3 H2O
ID<-c(1,2,3)
method<-c("CACL","KCL","H2O")
df2<-data.frame(ID,method)
ID method
1 1 CACL
2 2 KCL
3 3 H2O
I have done something similar before, subsetting by selecting a minimum number within a level, but I am not able to adapt it. I am wondering whether I should use ifelse here too?
#if present, choose rows containing "number" 2 instead of 1 (this column contained only the two numbers 1 and 2)
library(dplyr)
new<-df %>%
group_by(col1,col2,col3) %>%
summarize(number = ifelse(any(number > 1), min(number[number>1]),1))
dfnew<-merge(new,df,by=c("colxyz","number"),all.x=T)
You can use order with match and then simply !duplicated:
df1 <- df1[order(match(df1$method, c("CACL","KCL","H2O"))),]
df1[!duplicated(df1$ID),]
# ID method
#1 1 CACL
#5 2 KCL
#6 3 H2O
#Variant not changing df1
i <- order(match(df1$method, c("CACL","KCL","H2O")))
df1[i[!duplicated(df1$ID[i])],]
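To make the steps above easier to follow, here is a minimal sketch that prints the intermediate values on the question's toy data. The names pref, rank and res are only illustrative and do not appear in the original answer. One thing worth knowing: a method that is not listed in pref gets rank NA, and order puts NA last by default, so such rows are only picked when an ID has nothing better.

```r
# Toy data from the question
df1 <- data.frame(ID = c(1, 1, 1, 2, 2, 3),
                  method = c("CACL", "KCL", "H2O", "H2O", "KCL", "H2O"))

# Preference order, most preferred first (illustrative name)
pref <- c("CACL", "KCL", "H2O")

# match() turns each method into its position in pref (1 = most preferred)
rank <- match(df1$method, pref)
rank
# [1] 1 2 3 3 2 3

# order() returns row indices sorted by that rank; ties keep their original order
i <- order(rank)
i
# [1] 1 2 5 3 4 6

# duplicated() flags every ID after its first appearance in the sorted order,
# so negating it keeps the best-ranked row per ID
res <- df1[i[!duplicated(df1$ID[i])], ]
res
```

The result contains rows 1, 5 and 6 of df1, the same rows as in the output shown above.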
An option using dplyr:
df1 %>%
mutate(preference = match(method, c("CACL","KCL","H2O"))) %>%
group_by(ID) %>%
filter(preference == min(preference)) %>%
select(-preference)
# A tibble: 3 x 2
# Groups: ID [3]
ID method
<dbl> <fct>
1 1 CACL
2 2 KCL
3 3 H2O
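Note that filter(preference == min(preference)) keeps every row that ties for the best rank, so an ID that lists its preferred method twice would come back with two rows. If exactly one row per ID is needed, a possible variant (a sketch, assuming dplyr 1.0.0 or later, where slice_min is available) is shown below. The data frame df_dup and the vector pref are made up for illustration.

```r
library(dplyr)

# Hypothetical input where ID 2 lists "KCL" twice
df_dup <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 3),
                     method = c("CACL", "KCL", "H2O", "H2O", "KCL", "KCL", "H2O"))

# Preference order, most preferred first (illustrative name)
pref <- c("CACL", "KCL", "H2O")

res <- df_dup %>%
  group_by(ID) %>%
  # rank each method by its position in pref and keep a single best row per ID
  slice_min(match(method, pref), n = 1, with_ties = FALSE) %>%
  ungroup()

res
```

With with_ties = FALSE, only the first of the tied rows is kept, which mirrors what !duplicated does in the base R answer.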