简体   繁体   中英

Sorting an array of strings in Ruby

I have learned two array sorting methods in Ruby:

array = ["one", "two", "three"]
array.sort.reverse!

or:

array = ["one", "two", "three"]
array.sort { |x,y| y<=>x }

And I am not able to differentiate between the two. Which method is better and how exactly are they different in execution?

Both lines do the same (create a new array, which is reverse sorted). The main argument is about readability and performance. array.sort.reverse! is more readable than array.sort{|x,y| y<=>x} array.sort{|x,y| y<=>x} - I think we can agree here.

For the performance part, I created a quick benchmark script, which gives the following on my system ( ruby 1.9.3p392 [x86_64-linux] ):

                              user     system      total        real
array.sort.reverse        1.330000   0.000000   1.330000 (  1.334667)
array.sort.reverse!       1.200000   0.000000   1.200000 (  1.198232)
array.sort!.reverse!      1.200000   0.000000   1.200000 (  1.199296)
array.sort{|x,y| y<=>x}   5.220000   0.000000   5.220000 (  5.239487)

Run times are pretty constant for multiple executions of the benchmark script.

array.sort.reverse (with or without ! ) is way faster than array.sort{|x,y| y<=>x} array.sort{|x,y| y<=>x} . Thus, I recommend that.


Here is the script as a Reference:

#!/usr/bin/env ruby
require 'benchmark'

Benchmark.bm do|b|
  master = (1..1_000_000).map(&:to_s).shuffle
  a = master.dup
  b.report("array.sort.reverse      ") do
    a.sort.reverse
  end

  a = master.dup
  b.report("array.sort.reverse!     ") do
    a.sort.reverse!
  end

  a = master.dup
  b.report("array.sort!.reverse!    ") do
    a.sort!.reverse!
  end

  a = master.dup
  b.report("array.sort{|x,y| y<=>x} ") do
    a.sort{|x,y| y<=>x}
  end
end

There really is no difference here. Both methods return a new array.

For the purposes of this example, simpler is better. I would recommend array.sort.reverse because it is much more readable than the alternative. Passing blocks to methods like sort should be saved for arrays of more complex data structures and user-defined classes.

Edit: While destructive methods (anything ending in a !) are good for performance games, it was pointed out that they aren't required to return an updated array, or anything at all for that matter. It is important to keep this in mind because array.sort.reverse! could very likely return nil . If you wish to use a destructive method on a newly generated array, you should prefer calling .reverse! on a separate line instead of having a one-liner.

Example:

array = array.sort
array.reverse!

should be preferred to

array = array.sort.reverse!

Reverse! is Faster

There's often no substitute for benchmarking. While it probably makes no difference in shorter scripts, the #reverse! method is significantly faster than sorting using the "spaceship" operator. For example, on MRI Ruby 2.0, and given the following benchmark code:

require 'benchmark'

array = ["one", "two", "three"]
loops = 1_000_000

Benchmark.bmbm do |bm|
    bm.report('reverse!')  { loops.times {array.sort.reverse!} }
    bm.report('spaceship') { loops.times {array.sort {|x,y| y<=>x} }}
end

the system reports that #reverse! is almost twice as fast as using the combined comparison operator.

                user     system      total        real
reverse!    0.340000   0.000000   0.340000 (  0.344198)
spaceship   0.590000   0.010000   0.600000 (  0.595747)

My advice: use whichever is more semantically meaningful in a given context, unless you're running in a tight loop.

With comparison as simple as your example, there is not much difference, but as the formula for comparison gets complicated, it is better to avoid using <=> with a block because the block you pass will be evaluated for each element of the array, causing redundancy. Consider this:

array.sort{|x, y| some_expensive_method(x) <=> some_expensive_method(y)}

In this case, some_expensive_method will be evaluated for each possible pair of element of array .

In your particular case, use of a block with <=> can be avoided with reverse .

array.sort_by{|x| some_expensive_method(x)}.reverse

This is called Schwartzian transform.

In playing with tessi's benchmarks on my machine, I've gotten some interesting results. I'm running ruby 2.0.0p195 [x86_64-darwin12.3.0] , ie, latest release of Ruby 2 on an OS X system. I used bmbm rather than bm from the Benchmark module. My timings are:

Rehearsal -------------------------------------------------------------
array.sort.reverse:         1.010000   0.000000   1.010000 (  1.020397)
array.sort.reverse!:        0.810000   0.000000   0.810000 (  0.808368)
array.sort!.reverse!:       0.800000   0.010000   0.810000 (  0.809666)
array.sort{|x,y| y<=>x}:    0.300000   0.000000   0.300000 (  0.291002)
array.sort!{|x,y| y<=>x}:   0.100000   0.000000   0.100000 (  0.105345)
---------------------------------------------------- total: 3.030000sec

                                user     system      total        real
array.sort.reverse:         0.210000   0.000000   0.210000 (  0.208378)
array.sort.reverse!:        0.030000   0.000000   0.030000 (  0.027746)
array.sort!.reverse!:       0.020000   0.000000   0.020000 (  0.020082)
array.sort{|x,y| y<=>x}:    0.110000   0.000000   0.110000 (  0.107065)
array.sort!{|x,y| y<=>x}:   0.110000   0.000000   0.110000 (  0.105359)

First, note that in the Rehearsal phase that sort! using a comparison block comes in as the clear winner. Matz must have tuned the heck out of it in Ruby 2!

The other thing that I found exceedingly weird was how much improvement array.sort.reverse! and array.sort!.reverse! exhibited in the production pass. It was so extreme it made me wonder whether I had somehow screwed up and passed these already sorted data, so I added explicit checks for sorted or reverse-sorted data prior to performing each benchmark.


My variant of tessi's script follows:

#!/usr/bin/env ruby
require 'benchmark'

class Array
  def sorted?
    (1...length).each {|i| return false if self[i] < self[i-1] }
    true
  end

  def reversed?
    (1...length).each {|i| return false if self[i] > self[i-1] }
    true
  end
end

master = (1..1_000_000).map(&:to_s).shuffle

Benchmark.bmbm(25) do|b|
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort.reverse:") { a.sort.reverse }

  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort.reverse!:") { a.sort.reverse! }

  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort!.reverse!:") { a.sort!.reverse! }

  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort{|x,y| y<=>x}:") { a.sort{|x,y| y<=>x} }

  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort!{|x,y| y<=>x}:") { a.sort!{|x,y| y<=>x} }
end

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM