top_n versus order in R

Question

I am having trouble understanding the output from dplyr's top_n function. Can anybody help?

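library(dplyr)

n <- 10

# sample data: 10 random letters with normally distributed scores
df <- data.frame(ref = sample(letters, n), score = rnorm(n))

# top 5 rows by score using dplyr
print(top_n(df, 5, score))

# top 5 rows by score using base R ordering
print(df[order(df$score, decreasing = TRUE)[1:5], ])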

The output from top_n is not ordered by score as I expected. Compare it with the output from the order function:

  ref      score
1   i 0.71556494
2   p 0.04463846
3   v 0.37290990
4   g 1.53206194
5   f 0.86307107
   ref      score
7    g 1.53206194
10   f 0.86307107
1    i 0.71556494
6    v 0.37290990
4    p 0.04463846

The documentation I have read also implies that the top_n results should be ordered by the specified column, for example:

https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf

Answer 1

Both outputs contain the same rows. top_n only selects rows; it does not rearrange them, so they come back in their original order.

You can get the same rows in the same order as df[order(df$score, decreasing = TRUE)[1:5], ] by adding arrange():

top_n(df, 5, score) %>% arrange(desc(score))

Flipping the order around, df[order(df$score, decreasing = FALSE)[1:5], ] is equivalent to top_n(df, -5, score) %>% arrange(score).
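To check the equivalence described above, here is a minimal sketch. The seed is an assumption added only for reproducibility, so the values will not match the output printed in the question. It compares column values rather than whole data frames, because base R indexing keeps the original row names while the dplyr pipeline renumbers them.

```r
library(dplyr)

set.seed(42)  # assumed seed, only to make the sketch reproducible
n <- 10
df <- data.frame(ref = sample(letters, n), score = rnorm(n))

# Largest five scores, sorted in descending order
top_dplyr <- top_n(df, 5, score) %>% arrange(desc(score))
top_base  <- df[order(df$score, decreasing = TRUE)[1:5], ]

# Smallest five scores, sorted in ascending order
bottom_dplyr <- top_n(df, -5, score) %>% arrange(score)
bottom_base  <- df[order(df$score, decreasing = FALSE)[1:5], ]

# Compare the columns only, since the row names differ between the two approaches
stopifnot(identical(top_dplyr$ref, top_base$ref))
stopifnot(identical(bottom_dplyr$ref, bottom_base$ref))
```

Both checks should pass as long as there are no tied scores. With ties, top_n can return more than five rows, so the two approaches may differ.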

Answer 2

My misunderstanding came from my reading of the documentation linked in the question and described in the comments. Despite what some documentation claims, top_n does not generate output ordered by wt.
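A small sketch makes this behaviour easy to see. The data frame below is a hypothetical toy example, not the data from the question. Its scores are chosen so that the original row order and the descending order differ.

```r
library(dplyr)

# Hypothetical toy data: the three largest scores sit in rows a, b and d
df <- data.frame(ref = c("a", "b", "c", "d", "e"),
                 score = c(5, 9, 1, 7, 3))

res <- top_n(df, 3, score)

# The selected rows keep their original order (a, b, d), not descending score order
stopifnot(identical(as.character(res$ref), c("a", "b", "d")))

# Sorting is a separate, explicit step
sorted <- arrange(res, desc(score))
stopifnot(identical(as.character(sorted$ref), c("b", "d", "a")))
```

In other words, top_n here acts like a filter on the rank of score, and arrange() is what produces the ordering.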

2 answers

solution1
1 2017-01-25 16:04:51

solution2
0 ACCEPTED 2017-01-26 09:03:41
