
Show the top n = 2 values of a column when the top values appear more than once

I have a dataset as follows

df <- tibble(
  X = c("1", "1", "3", "3", "5", "5", "6", "6", "6"),
  Y = c("X", "Y", "X", "Z", "Z", "Y", "Y", "X", "Z"))

and want the output below. That is, I always want to keep the rows holding the two highest values of column X, not by filtering on a specific hard-coded value but with a general command.

output <- tibble(
  X = c("6", "6", "6", "5", "5"),
  Y = c("Z", "X", "Y", "Y", "Z"))

I tried top_n, but it does not seem to work when the top n values appear multiple times.

You can combine unique, sort and tail to get the second-largest distinct value of X, and then use it as a cutoff to subset the rows with >=.

df[df$X >= tail(sort(unique(df$X)), 2)[1],]
#  X Y
#5 5 Z
#6 5 Y
#7 6 Y
#8 6 X
#9 6 Z
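
The one-liner above can be generalized to any n. The following is a minimal sketch, not part of the original answer: the helper name keep_top_n_values and its arguments are hypothetical, and it assumes the column is numeric (or otherwise sorts in the intended order).

```r
# Hypothetical helper (not from the original answer): keep every row whose
# value in `col` is among the n largest distinct values of that column.
keep_top_n_values <- function(df, col, n = 2) {
  vals <- sort(unique(df[[col]]))   # distinct values, ascending (NAs dropped)
  cutoff <- tail(vals, n)[1]        # n-th largest distinct value
  # which() drops rows where the comparison is NA
  df[which(df[[col]] >= cutoff), , drop = FALSE]
}

df <- data.frame(
  X = c(1, 1, 3, 3, 5, 5, 6, 6, 6),
  Y = c("X", "Y", "X", "Z", "Z", "Y", "Y", "X", "Z"))

top2 <- keep_top_n_values(df, "X", n = 2)
```

If the column has fewer than n distinct values, tail returns all of them, so the cutoff becomes the minimum and every non-NA row is kept.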

Data

df <- data.frame(
  X = c(1, 1, 3, 3, 5, 5, 6, 6, 6),
  Y = c("X", "Y", "X", "Z", "Z", "Y", "Y", "X", "Z"))
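
Note that the question stores X as character while the data above uses numbers. Character values sort lexicographically (for example "10" sorts before "5"), so converting first is safer. Since the question mentions top_n, here is a hedged dplyr sketch of the same idea using dense_rank, which gives tied values the same rank. It assumes dplyr is installed; the names df_chr and top2 are illustrative only.

```r
library(dplyr)

# The question's data, with X stored as character
df_chr <- tibble(
  X = c("1", "1", "3", "3", "5", "5", "6", "6", "6"),
  Y = c("X", "Y", "X", "Z", "Z", "Y", "Y", "X", "Z"))

top2 <- df_chr %>%
  mutate(X = as.numeric(X)) %>%          # avoid lexicographic ordering
  filter(dense_rank(desc(X)) <= 2) %>%   # rows with the two largest distinct X
  arrange(desc(X))                       # largest X first, as in the desired output
```

Because dense_rank ranks distinct values without gaps, the filter keeps all rows for the two highest values no matter how often each one repeats.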
