Consider grouping by a column and selecting rows based on other columns in R

My data is a data frame (fpo):

     damIDpoG4 damSirepoG4 damGpoG4 damPhenpoG4 damTBVpoG4 damGBVpoG4
[1,]    450622      430878        4    5.540501   4.260957   3.422568
[2,]    450623      430878        4    3.046358   4.169094   3.528200
[3,]    450625      430878        4    4.515801   4.543196   3.843761
....
[50,]    450626      470878        4    4.798896   4.501067   3.875034
[51,]    450630      470878        4    4.282659   4.388037   3.830042
[52,]    450632      470878        4    3.553223   4.086484   3.571130

I want to select n rows (for example, 12) of damIDpoG4 within each group of identical damSirepoG4 values, chosen by the maximum and/or the top 20% of damGBVpoG4. damSirepoG4 contains 250 groups of identical values. I tried:

fpo %>% group_by(fpo[,2]) %>% sample_n(12)

but the result is not correct. I could not work out how to apply the max or a percentage with dplyr.
Thanks for your attention.

We need to pass the column name to group_by (assuming that 'fpo' is a data.frame/tbl_df and not a matrix):

fpo %>% 
    group_by(damSirepoG4) %>%
    sample_n(12)
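
Note that sample_n(12) draws 12 rows at random within each group, so it does not use damGBVpoG4 at all. If the goal is the 12 highest damGBVpoG4 values per sire, or the top 20% per sire, something along the following lines may work. This is only a sketch: it assumes dplyr 1.0.0 or later (for slice_max), the toy data below is made up for illustration, and the names top12_per_sire and top20pct_per_sire are hypothetical.

```r
library(dplyr)

# Toy stand-in for 'fpo': two sire groups of 20 dams each.
# The damGBVpoG4 values are randomly generated, not the real data.
set.seed(1)
fpo <- data.frame(
  damIDpoG4   = 1:40,
  damSirepoG4 = rep(c(430878, 470878), each = 20),
  damGBVpoG4  = rnorm(40, mean = 3.7, sd = 0.2)
)

# If 'fpo' is really a matrix (the [1,] row labels in the printout
# suggest it might be), convert it first:
# fpo <- as.data.frame(fpo)

# 12 rows with the highest damGBVpoG4 within each sire group
top12_per_sire <- fpo %>%
  group_by(damSirepoG4) %>%
  slice_max(damGBVpoG4, n = 12, with_ties = FALSE) %>%
  ungroup()

# Top 20% of rows by damGBVpoG4 within each sire group
top20pct_per_sire <- fpo %>%
  group_by(damSirepoG4) %>%
  slice_max(damGBVpoG4, prop = 0.2, with_ties = FALSE) %>%
  ungroup()
```

with_ties = FALSE keeps the group size at exactly n (or the requested proportion) when several dams share the same damGBVpoG4. Drop it if tied rows should all be kept.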
