
Sorting and rearranging an array of hashes based on multiple conditions

I'm trying to sort an array based on 3 different criteria. Let's say I have an array of hashes like this:

a = [
    { "name" => "X", "year" => "2013-08"},
    { "name" => "A", "year" => "2017-01"},
    { "name" => "X", "year" => "2000-08"},
    { "name" => "B", "year" => "2018-05"},
    { "name" => "D", "year" => "2016-04"},
    { "name" => "C", "year" => "2016-04"}
]

I would like to sort all elements first by "year" in descending order, then by "name" in ascending order, then move all elements matching a given name to the beginning of the array, while still respecting the "year" order. For this example, let's say I'm looking for elements with a "name" value of "X". So the output I'm looking for would be:

{"name"=>"X", "year"=>"2013-08"}
{"name"=>"X", "year"=>"2000-08"}
{"name"=>"B", "year"=>"2018-05"}
{"name"=>"A", "year"=>"2017-01"}
{"name"=>"C", "year"=>"2016-04"}
{"name"=>"D", "year"=>"2016-04"}

So everything is in descending order of "year", then ascending order of "name", then all hashes where "name" == "X" moved to the top, still sorted by "year".

I took care of the ascending/descending sorting by doing this:

a.sort { |a,b| [b["year"], a["name"]] <=> [a["year"], b["name"]] }        

But this only handles the first 2 criteria of what I need. I tried something like this afterward:

top = []
a.each { |x| top << x if x["name"] == "X" }
a.delete_if { |x| x["name"] == "X"}
a.unshift(*top)   # splat so the hashes are prepended individually, not as one nested array

which does produce the desired output, but is obviously clunky and doesn't seem like the best way to be doing things. Is there a faster, more efficient way to do what I'm trying to do?

(FYI, the year values are strings and I can't convert them into integers. I simplified the values here, but the data I'm pulling from actually appends a series of other characters and symbols at the end of each value.)

sort is not the best tool when every element is ordered by the same derived key. The faster method in that case is sort_by:

a.sort_by { |e| [ e["year"], e["name"] ] }

Since you want them in reverse order:

a.sort_by { |e| [ e["year"], e["name"] ] }.reverse

In effect, sort_by converts each element to the key returned by the block, then sorts on those keys. The conversion is done exactly once per element, which is less messy and less work than sort, whose block must rebuild both keys every time a comparison is made.
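To make the difference concrete, here is a small sketch that counts how often the key is built under each method. The `key` lambda and the counters are illustrative helpers, not part of the original answer, and the exact count for sort depends on Ruby's sort implementation.

```ruby
data = [
  { "name" => "X", "year" => "2013-08" },
  { "name" => "A", "year" => "2017-01" },
  { "name" => "X", "year" => "2000-08" },
  { "name" => "B", "year" => "2018-05" },
  { "name" => "D", "year" => "2016-04" },
  { "name" => "C", "year" => "2016-04" }
]

# Illustrative helper: builds the sort key and counts each call.
key_calls = 0
key = lambda do |h|
  key_calls += 1
  [h["year"], h["name"]]
end

# sort: the block runs once per comparison and builds two keys each time.
by_sort = data.sort { |x, y| key.call(x) <=> key.call(y) }
calls_with_sort = key_calls

# sort_by: the block runs once per element.
key_calls = 0
by_sort_by = data.sort_by { |h| key.call(h) }
calls_with_sort_by = key_calls

puts "sort:    #{calls_with_sort} key computations"
puts "sort_by: #{calls_with_sort_by} key computations"
```

Both calls return the same ordering here; only the amount of key-building work differs.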

Now if you want to sort the "X" entries to the top you can easily add that as an additional criterion:

a.sort_by { |e| [ e["name"] == "X" ? 1 : 0, e["year"], e["name"] ] }.reverse

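That puts the "X" entries first and orders each group by descending year. Note that .reverse also flips the name tiebreak, so entries with the same year come out in descending name order (D before C here) rather than ascending.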

The nice thing about sort_by is you can usually express really complicated sorting logic as a series of elements in an array. Provided each element is Comparable it all works out.
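Because .reverse flips every component of the key at once, a key that mixes descending and ascending parts needs one component inverted on its own. Strings cannot be negated, so one option is a tiny wrapper whose <=> is reversed. This is a minimal sketch; `Desc` is a hypothetical helper introduced here, not a Ruby built-in.

```ruby
# Hypothetical wrapper: compares in the opposite direction of the wrapped value.
Desc = Struct.new(:value) do
  def <=>(other)
    other.value <=> value
  end
end

a = [
  { "name" => "X", "year" => "2013-08" },
  { "name" => "A", "year" => "2017-01" },
  { "name" => "X", "year" => "2000-08" },
  { "name" => "B", "year" => "2018-05" },
  { "name" => "D", "year" => "2016-04" },
  { "name" => "C", "year" => "2016-04" }
]

# Key: "X" first (0 before 1), then year descending, then name ascending.
result = a.sort_by do |e|
  [e["name"] == "X" ? 0 : 1, Desc.new(e["year"]), e["name"]]
end

result.each { |h| p h }
```

No .reverse is needed, since each component of the key already carries its own direction.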

arr = [
  {"name"=>"X", "year"=>"2013-08"},
  {"name"=>"X", "year"=>"2000-08"},
  {"name"=>"B", "year"=>"2018-05"},
  {"name"=>"A", "year"=>"2017-01"},
  {"name"=>"C", "year"=>"2016-04"},
  {"name"=>"D", "year"=>"2016-04"},
]

When parts of an array are to be sorted differently than other parts of the array, I find it beneficial to partition the array into the associated parts, sort each part separately, then combine the results of those sorts. Not only is this approach generally easy for readers to follow, but it simplifies testing and tends to be at least as efficient as performing a single, more complex sort. Here, we would partition the array into two parts.

x, non_x = arr.partition { |h| h["name"] == 'X' }
  #=> [[{"name"=>"X", "year"=>"2013-08"}, {"name"=>"X", "year"=>"2000-08"}],
  #    [{"name"=>"B", "year"=>"2018-05"}, {"name"=>"A", "year"=>"2017-01"},
  #     {"name"=>"C", "year"=>"2016-04"}, {"name"=>"D", "year"=>"2016-04"}]]

Sorting the array x is easy.

sorted_x = x.sort_by { |h| h["year"] }.reverse
  #=> [{"name"=>"X", "year"=>"2013-08"}, {"name"=>"X", "year"=>"2000-08"}]

Sorting non_x is more complex because it is to be sorted by decreasing order of the values of "year", with ties broken by the values of "name" in increasing order. In this situation we can always use Array#sort.

non_x.sort do |g,h|
  case g["year"] <=> h["year"]
  when -1
    1
  when 1
    -1
  when 0
    g["name"] <=> h["name"]   # ascending name; returns 0 when the names are equal too
  end
end
  #=> [{"name"=>"B", "year"=>"2018-05"}, {"name"=>"A", "year"=>"2017-01"},
  #    {"name"=>"C", "year"=>"2016-04"}, {"name"=>"D", "year"=>"2016-04"}]
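The same comparison can be written more compactly. This sketch relies on Integer#nonzero?, which returns the receiver when it is not zero and nil otherwise, so the name comparison only runs when the years tie. The input order below is made up for illustration.

```ruby
non_x = [
  { "name" => "D", "year" => "2016-04" },
  { "name" => "A", "year" => "2017-01" },
  { "name" => "C", "year" => "2016-04" },
  { "name" => "B", "year" => "2018-05" }
]

sorted_non_x = non_x.sort do |g, h|
  # Year descending (operands swapped); fall back to name ascending on a tie.
  (h["year"] <=> g["year"]).nonzero? || (g["name"] <=> h["name"])
end

sorted_non_x.each { |h| p h }
```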

With a bit of effort we could alternatively use Enumerable#sort_by . Given a hash h , we would need to sort on either

[h["year"], f(h["name"])].reverse

where f is a method that causes h["name"] to be sorted in decreasing order, or (note no .reverse in the following)

[f(h["year"]), h["name"]]

where f is a method that causes h["year"] to be sorted in decreasing order. The latter is the easier of the two to implement. We could use the following method.

def year_str_to_int(year_str)
  yr, mon = year_str.split('-').map(&:to_i)
  12 * yr + mon
end

This allows us to sort non_x as desired:

sorted_non_x = non_x.sort_by { |h| [-year_str_to_int(h["year"]), h["name"]] }
  #=> [{"name"=>"B", "year"=>"2018-05"}, {"name"=>"A", "year"=>"2017-01"},
  #    {"name"=>"C", "year"=>"2016-04"}, {"name"=>"D", "year"=>"2016-04"}]
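The question mentions that the real "year" values have extra characters appended. String#to_i reads the leading digits and ignores whatever follows, so the conversion may still work, assuming every value begins with YYYY-MM. The suffixes below are invented for illustration; the actual data may look different.

```ruby
def year_str_to_int(year_str)
  # Only the first two '-'-separated fields are used; to_i drops trailing non-digits.
  yr, mon = year_str.split('-').map(&:to_i)
  12 * yr + mon
end

# Invented sample values with trailing characters.
samples = ["2013-08", "2013-08#a1", "2013-08 (rev-2)"]
converted = samples.map { |s| year_str_to_int(s) }

p converted
```

If some values do not start with a year and month, this conversion would not apply and a string-based comparison is the safer choice.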

We now simply combine the two sorted partitions.

sorted_x.concat(sorted_non_x)
  #=> [{"name"=>"X", "year"=>"2013-08"}, {"name"=>"X", "year"=>"2000-08"},
  #    {"name"=>"B", "year"=>"2018-05"}, {"name"=>"A", "year"=>"2017-01"}, 
  #    {"name"=>"C", "year"=>"2016-04"}, {"name"=>"D", "year"=>"2016-04"}]
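The partition, sort, and combine steps can be wrapped in one method. This is a sketch under the assumption that both partitions use the same ordering (year descending, then name ascending); the method names are hypothetical, and the comparator is the one from the question.

```ruby
# Hypothetical helper: year descending, then name ascending.
def sort_year_desc_name_asc(records)
  records.sort { |g, h| [h["year"], g["name"]] <=> [g["year"], h["name"]] }
end

# Hypothetical helper: records named pinned_name first, each part sorted separately.
def sort_with_name_first(records, pinned_name)
  pinned, rest = records.partition { |h| h["name"] == pinned_name }
  sort_year_desc_name_asc(pinned) + sort_year_desc_name_asc(rest)
end

arr = [
  { "name" => "X", "year" => "2013-08" },
  { "name" => "A", "year" => "2017-01" },
  { "name" => "X", "year" => "2000-08" },
  { "name" => "B", "year" => "2018-05" },
  { "name" => "D", "year" => "2016-04" },
  { "name" => "C", "year" => "2016-04" }
]

x_first = sort_with_name_first(arr, "X")
x_first.each { |h| p h }
```

Passing a different name, such as "A", moves that group to the front instead, without changing the method.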

You can also write your own comparator by wrapping each hash in a class that implements <=>, like so:

require 'pp'

a = [
    { "name" => "X", "year" => "2013-08"},
    { "name" => "A", "year" => "2017-01"},
    { "name" => "X", "year" => "2000-08"},
    { "name" => "B", "year" => "2018-05"},
    { "name" => "D", "year" => "2016-04"},
    { "name" => "C", "year" => "2016-04"}
]

class NameYearSorter
  attr_reader :value
  def initialize(value)
    @value = value
  end

  def name
    value['name']
  end

  def year
    value['year']
  end

  # Must return -1, 0, or 1; returning 0 for a "greater" element would
  # wrongly tell sort that the two elements are equal.
  def <=>(other)
    if self.name != 'X' && other.name != 'X'
      if self.year == other.year
        self.name <=> other.name
      else
        other.year <=> self.year   # later year first
      end
    elsif self.name == 'X' && other.name != 'X'
      -1
    elsif other.name == 'X' && self.name != 'X'
      1
    else
      other.year <=> self.year     # both are 'X': later year first
    end
  end
end

sortable = a.map{ |v| NameYearSorter.new(v) }
pp sortable.sort.map(&:value)

# Output:
#=> [{"name"=>"X", "year"=>"2013-08"},
#=>  {"name"=>"X", "year"=>"2000-08"},
#=>  {"name"=>"B", "year"=>"2018-05"},
#=>  {"name"=>"A", "year"=>"2017-01"},
#=>  {"name"=>"C", "year"=>"2016-04"},
#=>  {"name"=>"D", "year"=>"2016-04"}]

Here is another option that uses what you already have as a base (since you were most of the way there):

a = [
  { "name" => "X", "year" => "2013-08"},
  { "name" => "A", "year" => "2017-01"},
  { "name" => "X", "year" => "2000-08"},
  { "name" => "B", "year" => "2018-05"},
  { "name" => "D", "year" => "2016-04"},
  { "name" => "C", "year" => "2016-04"}
]


a.sort do  |a,b| 
  a_ord, b_ord = [a,b].map {|e| e["name"] == "X" ? 0 : 1 }
  [a_ord,b["year"],a["name"] ] <=> [b_ord, a["year"],b["name"]]
end

Here we make sure "X" always sorts first by assigning it a 0 and everything else a 1. When both ranks are equal (two "X" entries, or two non-"X" entries), the comparison falls through to the year and name logic you already have. We can make this a bit fancier as:

a.sort do  |a,b| 
  [a,b].map {|e| e["name"] == "X" ? 0 : 1 }.zip(
    [b["year"],a["year"]],[a["name"],b["name"]]
  ).reduce(:<=>)
end
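Both versions rely on Array#<=> comparing arrays element by element and stopping at the first pair that differs. A few comparisons of the [rank, year, name] form illustrate this; the variable names are only for illustration.

```ruby
# Rank differs: 0 sorts before 1, so the later elements are never consulted.
rank_decides = [0, "2000-08", "X"] <=> [1, "2018-05", "B"]

# Equal ranks: the second element (year) decides.
year_decides = [1, "2016-04", "D"] <=> [1, "2018-05", "B"]

# Equal ranks and years: the third element (name) decides.
name_decides = [1, "2016-04", "C"] <=> [1, "2016-04", "D"]

# Identical arrays compare as equal.
all_equal = [0, "2013-08", "X"] <=> [0, "2013-08", "X"]

p [rank_decides, year_decides, name_decides, all_equal]
```

In the sort blocks above, the year operands are swapped between the two arrays, which is what turns the plain ascending comparison into a descending one for that position only.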
