Create indicator for two highest observations?
I am working with the following data frame:
Month Week Y Name Color
January 2 1.2 Joe Red
January 2 3.3 Eric Red
January 2 4.5 Mike Blue
January 2 1.7 Brian Blue
January 2 2.9 Pete Red
January 3 4.6 Joe Red
January 3 5.1 Eric Blue
January 3 2.1 Mike Blue
January 3 6.9 Pete Red
...
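I want to create a new column ("Highest") that identifies the individuals with the two highest Y values in a given week who also have the color 'Blue'. They should be labeled A and B, which will make it easier to draw line segments later in my project.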
Month Week Y Name Highest
January 2 1.2 Joe -
January 2 3.3 Eric B
January 2 4.5 Mike A
January 2 1.7 Brian -
January 2 2.9 Pete -
January 3 4.6 Joe -
January 3 5.1 Eric B
January 3 2.1 Mike A
January 3 6.9 Pete -
...
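Also, as you can see in the table above, I would like the "Highest" column to stay the same throughout the month: for all observations in a given month, the column should show the individuals who had the two highest Y values in Week 2. I assume this will require group_by(Month, Week) %>%
You can arrange the data by Y value and assign 'A' and 'B' to the top two values.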
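library(dplyr)

df %>%
  # Sort so the largest Y comes first within each Month/Week
  arrange(Month, Week, desc(Y)) %>%
  group_by(Month, Week) %>%
  # Label the first two rows of each group; assumes every group has at least 2 rows
  mutate(Highest = c('A', 'B', rep(NA, n() - 2)))

# If you want '-' instead of NA, use this mutate() call instead:
# mutate(Highest = c('A', 'B', rep('-', n() - 2)))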
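
The snippet above labels the top two rows in every week and ignores Color. If the labels should come only from 'Blue' rows in Week 2 and then be repeated for the rest of the month, one possible approach is to build a small lookup table and join it back. This is only a sketch: the reference week (`ref_week`), the `labels` and `result` names, and matching people by Month and Name are my assumptions, not something stated in the question. Note that with the sample data as posted, Eric is Red in Week 2, so this gives Mike = A and Brian = B rather than the table shown in the question.

```r
library(dplyr)

# Sample rows copied from the question
df <- data.frame(
  Month = "January",
  Week  = c(2, 2, 2, 2, 2, 3, 3, 3, 3),
  Y     = c(1.2, 3.3, 4.5, 1.7, 2.9, 4.6, 5.1, 2.1, 6.9),
  Name  = c("Joe", "Eric", "Mike", "Brian", "Pete", "Joe", "Eric", "Mike", "Pete"),
  Color = c("Red", "Red", "Blue", "Blue", "Red", "Red", "Blue", "Blue", "Red"),
  stringsAsFactors = FALSE
)

ref_week <- 2  # assumption: the week whose ranking defines the labels

# Lookup table: the top two 'Blue' rows of the reference week, per month
labels <- df %>%
  filter(Week == ref_week, Color == "Blue") %>%
  group_by(Month) %>%
  arrange(desc(Y), .by_group = TRUE) %>%
  # Rows 1 and 2 get 'A' and 'B'; indexing past the end gives NA
  mutate(Highest = c("A", "B")[row_number()]) %>%
  ungroup() %>%
  filter(!is.na(Highest)) %>%
  select(Month, Name, Highest)

# Repeat the labels on every row of the same month and person
result <- df %>%
  left_join(labels, by = c("Month", "Name")) %>%
  mutate(Highest = coalesce(Highest, "-"))

print(result)
```

Because the ranking is done on a filtered copy and joined back, a month with fewer than two 'Blue' rows in the reference week simply gets fewer labels instead of raising an error.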