
How can I rank on more than one variable in dplyr

If I have:

library(dplyr)
df <- data.frame(name=c("A","B","C","D"),value1=c(8,9,8,10),value2=c(1,2,3,4))
df
  name value1 value2
1    A      8      1
2    B      9      2
3    C      8      3
4    D     10      4

# I want to do something like this without the error

newdf <- df %>%
  mutate(rank=row_number(desc(value1),desc(value2)))

newdf
    name value1 value2 rank
1    A      8      1    4
2    B      9      2    2
3    C      8      3    3
4    D     10      4    1

How can I rank the rows based on one column and use a second column to break ties?

Now that I have looked into this a bit further, I think this solves the problem:

df %>% arrange(desc(value1),desc(value2)) %>% mutate(rank=row_number())
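
If the original row order matters, one option is to tag each row before sorting and sort back afterwards. This is only a minimal sketch, assuming the df from the question; the helper column orig_row is a name introduced here for illustration.

```r
library(dplyr)

df <- data.frame(name = c("A", "B", "C", "D"),
                 value1 = c(8, 9, 8, 10),
                 value2 = c(1, 2, 3, 4))

ranked <- df %>%
  mutate(orig_row = row_number()) %>%          # remember the original position (hypothetical helper column)
  arrange(desc(value1), desc(value2)) %>%      # sort by value1, ties broken by value2
  mutate(rank = row_number()) %>%              # position in the sorted data is the rank
  arrange(orig_row) %>%                        # restore the original row order
  select(-orig_row)

ranked$rank
# 4 2 3 1
```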

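The following code produces the same result you posted in the question. It returns what row_number() would give after sorting, but your original data do not have to be rearranged. Note that order() on its own returns the sorting permutation (which row comes first, second, and so on), not each row's rank; the two happen to coincide for this data, so order() is applied twice here to get ranks in general.

newdf <- df %>%
  mutate(rank=order(order(-value1,-value2)))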
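
A quick way to see how a sorting permutation differs from a rank is to try values where the two do not coincide. The vectors below are made up for illustration and are not from the question; only base R is used.

```r
# Made-up values (not from the question) where permutation and rank differ
value1 <- c(8, 10, 9, 8)
value2 <- c(1, 5, 2, 3)

perm  <- order(-value1, -value2)  # row indices listed in sorted order
ranks <- order(perm)              # rank of each row, in original row order

perm
# 2 3 4 1   (row 2 sorts first, then row 3, row 4, row 1)
ranks
# 4 1 2 3   (row 1 is ranked 4th, row 2 is ranked 1st, ...)
```

In the question's data the permutation is 4 2 3 1, which is its own inverse, so a single order() call gives the right ranks there by coincidence.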

Please note: if you want dense_rank behavior (rows tied on both columns share one rank), the order()-based code will not do that.
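
For the dense-rank case, one possible approach is to sort first and then count distinct (value1, value2) combinations as they appear. This is a sketch rather than a definitive method; df2 is a made-up variant of the question's data with an extra row E that ties with C on both columns.

```r
library(dplyr)

# Hypothetical data: row E duplicates C's value1 and value2
df2 <- data.frame(name = c("A", "B", "C", "D", "E"),
                  value1 = c(8, 9, 8, 10, 8),
                  value2 = c(1, 2, 3, 4, 3))

dense <- df2 %>%
  arrange(desc(value1), desc(value2)) %>%
  # After sorting, identical key pairs are adjacent, so the running count of
  # first occurrences is a dense rank over both columns.
  mutate(rank = cumsum(!duplicated(data.frame(value1, value2))))

dense
# D gets rank 1, B gets 2, C and E share 3, A gets 4
```

Unlike the order()-based version, this returns the rows in sorted order, so the original row order would have to be restored separately if it matters.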
