Create indicator for two highest observations?
I am working with the following data frame:
Month Week Y Name Color
January 2 1.2 Joe Red
January 2 3.3 Eric Red
January 2 4.5 Mike Blue
January 2 1.7 Brian Blue
January 2 2.9 Pete Red
January 3 4.6 Joe Red
January 3 5.1 Eric Blue
January 3 2.1 Mike Blue
January 3 6.9 Pete Red
...
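I want to create a new column ('Highest') that identifies the individuals with the two highest Y values in a given week who also have the color 'Blue'. They should be labelled A and B, which will make it easier to create line segments later in my project.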
Month Week Y Name Highest
January 2 1.2 Joe -
January 2 3.3 Eric B
January 2 4.5 Mike A
January 2 1.7 Brian -
January 2 2.9 Pete -
January 3 4.6 Joe -
January 3 5.1 Eric B
January 3 2.1 Mike A
January 3 6.9 Pete -
...
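Also, as you can see in the table above, I want the 'Highest' column to stay the same throughout the month: on every observation in a given month, it should show the individuals who had the two highest Y values in week 2. I assume this will require group_by(Month, Week) %>%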
You can arrange the data by Y within each Month and Week and assign 'A' and 'B' to the top two values.
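library(dplyr)

df %>%
  # Sort so the largest Y comes first within each Month/Week
  arrange(Month, Week, desc(Y)) %>%
  group_by(Month, Week) %>%
  # Label the top two rows; this assumes every group has at least two rows
  mutate(Highest = c('A', 'B', rep(NA, n() - 2)))

# If you want '-' instead of NA:
# mutate(Highest = c('A', 'B', rep('-', n() - 2)))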
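The code above labels each week independently, whereas the question also asks for the week 2 labels to be repeated on every row of the month. One possible way to do that is sketched below; it is only an illustration under some assumptions. The names `ref_week`, `labels`, and `result` are made up for this example, each Name is assumed to appear at most once per Month and Week, and the `df` built here contains only the rows shown in the question. The 'Blue' condition is not applied, because the expected output labels Eric even though he is 'Red' in week 2.

```r
library(dplyr)

# Sample rows copied from the question (the elided rows are not reproduced).
df <- data.frame(
  Month = "January",
  Week  = c(2, 2, 2, 2, 2, 3, 3, 3, 3),
  Y     = c(1.2, 3.3, 4.5, 1.7, 2.9, 4.6, 5.1, 2.1, 6.9),
  Name  = c("Joe", "Eric", "Mike", "Brian", "Pete",
            "Joe", "Eric", "Mike", "Pete"),
  Color = c("Red", "Red", "Blue", "Blue", "Red",
            "Red", "Blue", "Blue", "Red")
)

# Assumption: week 2 is the reference week for every month.
ref_week <- 2

# Rank names within each month using only the reference week.
labels <- df %>%
  filter(Week == ref_week) %>%
  group_by(Month) %>%
  arrange(desc(Y), .by_group = TRUE) %>%
  mutate(Highest = c("A", "B")[row_number()]) %>%  # NA beyond the top two
  ungroup() %>%
  select(Month, Name, Highest)

# Attach the labels to every row of the month; everyone else gets '-'.
result <- df %>%
  left_join(labels, by = c("Month", "Name")) %>%
  mutate(Highest = coalesce(Highest, "-"))

print(result)
```

Indexing `c("A", "B")` with `row_number()` returns NA past the second row, so this variant also works when the reference week has fewer than two observations. If the colour condition is wanted, adding `Color == "Blue"` to the `filter()` call would be one way to apply it, though on the rows shown that would no longer reproduce the expected table.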