Rank within groups in R

> str(b)
'data.frame':   2720 obs. of  3 variables:
$ State        : chr  "AL" "AL" "AL" "AL" ...
$ Hospital.Name: chr  "SOUTHEAST ALABAMA MEDICAL CENTER" "MARSHALL MEDICAL CENTER SOUTH" "ELIZA COFFEE MEMORIAL HOSPITAL" "ST VINCENT'S EAST" ...
$ heart attack : num  14.3 18.5 18.1 17.7 18 15.9 19.6 17.3 17.8 17.5 ...

Above is my data frame. I want to group it by State and rank the hospitals by heart attack within each group, so my code looks like this:

c <- group_by(b,State) %>%
    mutate(rank = order(order('heart attack')))

But I got a result in which every value in the rank column equals 1:

> c
# A tibble: 2,720 x 4
# Groups:   State [54]
   State                    Hospital.Name `heart attack`  rank
   <chr>                            <chr>          <dbl> <int>
 1    AL SOUTHEAST ALABAMA MEDICAL CENTER           14.3     1
 2    AL    MARSHALL MEDICAL CENTER SOUTH           18.5     1
 3    AL   ELIZA COFFEE MEMORIAL HOSPITAL           18.1     1
 4    AL                ST VINCENT'S EAST           17.7     1
 5    AL   DEKALB REGIONAL MEDICAL CENTER           18.0     1
 6    AL    SHELBY BAPTIST MEDICAL CENTER           15.9     1
 7    AL   HELEN KELLER MEMORIAL HOSPITAL           19.6     1
 8    AL              DALE MEDICAL CENTER           17.3     1
 9    AL     BAPTIST MEDICAL CENTER SOUTH           17.8     1
10    AL    JACKSON HOSPITAL & CLINIC INC           17.5     1
# ... with 2,710 more rows

Can anyone help me figure out why it doesn't work?

The comment from alistaire is good, and I often find that stepping through a chained dplyr command like this one line at a time is quite useful when debugging. I used iris as a sample dataset:

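library(dplyr)

temp <- iris %>%
  group_by(Species) %>%
  # sort the rows so that each group's values are in ascending order
  arrange(Sepal.Length) %>%
  # on already-sorted values, order() returns 1..n within each group
  mutate(rank = order(Sepal.Length))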

Returns:

R> head(temp)
# A tibble: 6 x 6
# Groups:   Species [1]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species  rank
         <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <int>
1          4.3         3.0          1.1         0.1  setosa     1
2          4.4         2.9          1.4         0.2  setosa     2
3          4.4         3.0          1.3         0.2  setosa     3
4          4.4         3.2          1.3         0.2  setosa     4
5          4.5         2.3          1.3         0.3  setosa     5
6          4.6         3.1          1.5         0.2  setosa     6
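
Applying the same step-by-step check to the code in the question points at the quoting: 'heart attack' in single quotes is a string literal, not a column reference, so order() sees a character vector of length 1 and returns 1, which mutate() recycles down the column. A column name containing a space needs backticks instead. The sketch below is only an illustration on a made-up miniature of the question's data frame (the values in b and the name fixed are assumptions, not the real data):

```r
library(dplyr)

# Hypothetical miniature of the question's data frame (values are made up).
b <- data.frame(
  State = c("AL", "AL", "AL", "AK", "AK"),
  `heart attack` = c(14.3, 18.5, 18.1, 15.0, 13.2),
  check.names = FALSE  # keep the space in the column name
)

# Single quotes: a length-1 character vector, so the result is just 1.
order(order('heart attack'))

# Backticks: the column itself, ranked separately within each State.
fixed <- b %>%
  group_by(State) %>%
  mutate(rank = order(order(`heart attack`))) %>%
  ungroup()

fixed$rank  # 1 3 2 2 1
```

Note that order(order(x)) breaks ties by position, so tied values get different ranks.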

You can also use the rank() function in R:

temp2 <- iris %>%
  group_by(Species) %>%
  # rank() averages the ranks of tied values by default, hence values like 32.5
  mutate(rank = rank(Sepal.Length))

R> head(temp2)
# A tibble: 6 x 6
# Groups:   Species [1]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species  rank
         <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <dbl>
1          5.1         3.5          1.4         0.2  setosa  32.5
2          4.9         3.0          1.4         0.2  setosa  18.5
3          4.7         3.2          1.3         0.2  setosa  10.5
4          4.6         3.1          1.5         0.2  setosa   7.5
5          5.0         3.6          1.4         0.2  setosa  24.5
6          5.4         3.9          1.7         0.4  setosa  43.0
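
The fractional ranks above come from rank()'s default ties.method = "average". If whole-number ranks are wanted, a different tie rule can be chosen. The following is a small sketch on the same iris data; the column names rank_avg, rank_min and rank_first and the object temp3 are made up for illustration:

```r
library(dplyr)

temp3 <- iris %>%
  group_by(Species) %>%
  mutate(
    rank_avg   = rank(Sepal.Length),                        # default: ties share their average rank
    rank_min   = rank(Sepal.Length, ties.method = "min"),   # ties share the lowest rank
    rank_first = rank(Sepal.Length, ties.method = "first")  # ties broken by row order
  ) %>%
  ungroup()

head(temp3)
```

Which rule fits depends on whether tied hospitals should share a rank ("min") or every row needs a distinct one ("first").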
