How to compare similar related keys in array of hashes in Ruby?

Question

I have an array of hashes, let's say

arr = [
  {:country=>"Portugal", :rating=>"Great", :rating2=>"Average", :city1=>"Lisbon", :city2=>"Porto"},
  {:country=>"USA", :rating=>"Good", :rating2=>"Great", :city1=>"NYC", :city2=>"Birmingham"},
  {:country=>"UK", :rating=>"Dismal", :rating2=>"Poor", :city1=>"Birmingham", :city2=>"London"}
]

I'm trying to search this array for value matches between hashes across the linked columns (rating1 / rating2, city1 / city2) and record the result on each hash as a new key:value pair. I'm also trying to search on multiple terms at once.

E.g., when searching for "city", it should return:

{:country=>"Portugal", :rating=>"Great", :rating2=>"Average", :city1=>"Lisbon", :city2=>"Porto", :match=>"0"},

{:country=>"USA", :rating=>"Good", :rating2=>"Great", :city1=>"NYC", :city2=>"Birmingham", :match=>"1"},

{:country=>"UK", :rating=>"Dismal", :rating2=>"Poor", :city1=>"Birmingham", :city2=>"London", :match=>"1"}

If searching for "city" and "rating" together, it should return:

{:country=>"Portugal", :rating=>"Great", :rating2=>"Average", :city1=>"Lisbon", :city2=>"Porto", :match=>"1"},

{:country=>"USA", :rating=>"Good", :rating2=>"Great", :city1=>"NYC", :city2=>"Birmingham", :match=>"1"},

{:country=>"UK", :rating=>"Dismal", :rating2=>"Poor", :city1=>"Birmingham", :city2=>"London", :match=>"1"}

(Because Portugal's rating1 matches USA's rating2, and USA's city2 matches UK's city1.)

So far I can match on a single column and with a single attribute, but I'm having trouble when trying to work with the similar columns and multiple attributes:

def search_id(elements, search_key)
  # Get rid of elements with nil values.
  target_elements = elements.reject {|e| e[search_key].nil?}

  result = []
  id = 0

  # Iterate through target_elements
  target_elements.each do |currentElement|

    # Check for elements with same value in the result array
    match = result.find {|previousElement| currentElement[search_key] == previousElement[search_key]}

    if match
      currentElement[:id] = match[:id]
    else
      currentElement[:id] = id
      id += 1
    end

    # Add element to result
    result << currentElement
  end

  result
end

Answer 1

I have assumed that the key :rating should be :rating1.

Based on my understanding of the question, you could obtain the desired return values by executing add_counts below.

def add_counts(arr, *key_matches)
  arr.map { |h| h.merge(match: count_matches(arr, key_matches, h).to_s) }
end

def count_matches(arr, key_matches, h)
  arr.count { |g| match_either_pair?(key_matches, h, g) } - 1
end

def match_either_pair?(key_matches, h, g)
  key_matches.any? { |key_pair| match_either_key?(h, g, key_pair) }
end

def match_either_key?(h, g, key_pair)
  (h.values_at(*key_pair) & g.values_at(*key_pair)).any?
end

Let's try it.

arr = [
  {:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
   :city1=>"Lisbon", :city2=>"Porto"},
  {:country=>"USA", :rating1=>"Good", :rating2=>"Great", :city1=>"NYC",
   :city2=>"Birmingham"},
  {:country=>"UK", :rating1=>"Dismal", :rating2=>"Poor",
   :city1=>"Birmingham", :city2=>"London"}
]

add_counts(arr, [:city1, :city2])
  #=> [{:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #     :city1=>"Lisbon", :city2=>"Porto", :match=>"0"},
  #    {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #     :city1=>"NYC", :city2=>"Birmingham", :match=>"1"},
  #    {:country=>"UK", :rating1=>"Dismal", :rating2=>"Poor",
  #     :city1=>"Birmingham", :city2=>"London", :match=>"1"}]

add_counts(arr, [:rating1, :rating2], [:city1, :city2])
  #=> [{:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #     :city1=>"Lisbon", :city2=>"Porto", :match=>"1"},
  #    {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #     :city1=>"NYC", :city2=>"Birmingham", :match=>"2"},
  #    {:country=>"UK", :rating1=>"Dismal", :rating2=>"Poor",
  #     :city1=>"Birmingham", :city2=>"London", :match=>"1"}]

Let's examine each of the methods in reverse order. I assume arr is as above and

key_matches = [[:rating1, :rating2], [:city1, :city2]]

match_either_key?

Suppose

h = arr[0]
  #=> {:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #    :city1=>"Lisbon", :city2=>"Porto"}
g = arr[1]
  #=> {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #    :city1=>"NYC", :city2=>"Birmingham"}

First,

key_pair = [:rating1, :rating2]

a = h.values_at(*key_pair)
  #=> ["Great", "Average"]
b = g.values_at(*key_pair)
  #=> ["Good", "Great"]
c = a & b
  #=> ["Great"]
c.any?
  #=> true

As c.any? #=> true, there is no need to test for

key_pair = [:city1, :city2]

but had we done so,

a = h.values_at(*key_pair)
  #=> ["Lisbon", "Porto"]
b = g.values_at(*key_pair)
  #=> ["NYC", "Birmingham"]
c = a & b
  #=> []
c.any?
  #=> false
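The two checks above can be reproduced in isolation. The following is a minimal sketch; shared_values is a hypothetical helper name (not part of the answer) that returns the intersection itself rather than a boolean, so the common values are visible.

```ruby
# Hypothetical helper (not in the original answer): returns the values that
# h and g have in common under the given keys.
def shared_values(h, g, keys)
  # Hash#values_at returns the values for the keys, in order (nil if a key is missing).
  # Array#& keeps only the elements present in both arrays, without duplicates.
  h.values_at(*keys) & g.values_at(*keys)
end

portugal = { country: "Portugal", rating1: "Great", rating2: "Average",
             city1: "Lisbon", city2: "Porto" }
usa      = { country: "USA", rating1: "Good", rating2: "Great",
             city1: "NYC", city2: "Birmingham" }

ratings_in_common = shared_values(portugal, usa, [:rating1, :rating2])
cities_in_common  = shared_values(portugal, usa, [:city1, :city2])

p ratings_in_common  #=> ["Great"]
p cities_in_common   #=> []
```

Calling .any? on each result gives the true and false values shown in the walkthrough.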

match_either_pair?

As before, assume

h = arr[0]
  #=> {:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #    :city1=>"Lisbon", :city2=>"Porto"}
g = arr[1]
  #=> {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #    :city1=>"NYC", :city2=>"Birmingham"}

then

key_pair = key_matches[0]
  #=> [:rating1, :rating2]
match_either_key?(h, g, key_pair)
  #=> true

The latter was computed above. Since it returned true, there is no need to compute the value for key_matches[1].
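The short-circuiting can be observed directly. This is a small sketch, assuming the same two hashes; the checked array is an illustrative addition that records which key pairs the block actually examined.

```ruby
h = { country: "Portugal", rating1: "Great", rating2: "Average",
      city1: "Lisbon", city2: "Porto" }
g = { country: "USA", rating1: "Good", rating2: "Great",
      city1: "NYC", city2: "Birmingham" }
key_matches = [[:rating1, :rating2], [:city1, :city2]]

checked = []  # illustrative: key pairs the block was called with
result = key_matches.any? do |key_pair|
  checked << key_pair
  (h.values_at(*key_pair) & g.values_at(*key_pair)).any?
end

p result   #=> true
p checked  #=> [[:rating1, :rating2]]
```

any? stops at the first key pair for which the block returns a truthy value, so the city pair is never inspected here.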

count_matches

Again, we assume

h = arr[0]
  #=> {:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #    :city1=>"Lisbon", :city2=>"Porto"}

The first value passed to the block is h itself:

g = arr[0]

As h == g #=> true, both key pairs obviously match:

match_either_pair?(key_matches, h, g)
  #=> true

Next

g = arr[1]
  #=> {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #    :city1=>"NYC", :city2=>"Birmingham"}

and, as previously determined,

match_either_pair?(key_matches, h, g)
  #=> true

Lastly,

g = arr[2]
  #=> {:country=>"UK", :rating1=>"Dismal", :rating2=>"Poor",
  #    :city1=>"Birmingham", :city2=>"London"}
match_either_pair?(key_matches, h, g)
  #=> false

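(Neither pair shares a value: the ratings are ["Great", "Average"] and ["Dismal", "Poor"], and the cities are ["Lisbon", "Porto"] and ["Birmingham", "London"].)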

As arr[0] matches itself and arr[1], count returns 2.

We subtract 1 to obtain the number of matches with other elements of arr.
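Subtracting 1 relies on every hash matching itself. That holds for the sample data, but a hash that lacks all of the compared keys would not match itself, because values_at yields nils and [nil].any? is false. The sketch below shows one possible variant under that assumption; count_other_matches and the "Atlantis" row are hypothetical additions, not part of the answer or the question's data.

```ruby
def match_either_key?(h, g, key_pair)
  (h.values_at(*key_pair) & g.values_at(*key_pair)).any?
end

def match_either_pair?(key_matches, h, g)
  key_matches.any? { |key_pair| match_either_key?(h, g, key_pair) }
end

# As in the answer: counts all matches (including h itself), then subtracts 1.
def count_matches(arr, key_matches, h)
  arr.count { |g| match_either_pair?(key_matches, h, g) } - 1
end

# Hypothetical variant: skips h itself by object identity instead of subtracting 1.
def count_other_matches(arr, key_matches, h)
  arr.count { |g| !g.equal?(h) && match_either_pair?(key_matches, h, g) }
end

arr = [
  { country: "Portugal", rating1: "Great", rating2: "Average", city1: "Lisbon", city2: "Porto" },
  { country: "USA", rating1: "Good", rating2: "Great", city1: "NYC", city2: "Birmingham" },
  { country: "UK", rating1: "Dismal", rating2: "Poor", city1: "Birmingham", city2: "London" },
  { country: "Atlantis" }  # made-up row with none of the compared keys
]
key_matches = [[:rating1, :rating2], [:city1, :city2]]

original = arr.map { |h| count_matches(arr, key_matches, h) }
variant  = arr.map { |h| count_other_matches(arr, key_matches, h) }

p original  #=> [1, 2, 1, -1]
p variant   #=> [1, 2, 1, 0]
```

The two versions agree whenever every hash has at least one non-nil value under some compared key pair, as in the question's data.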

add_counts

h = arr[0]
n = count_matches(arr, key_matches, h)
  #=> 1
s = n.to_s
  #=> "1"
h.merge(match: s)
  #=> {:country=>"Portugal", :rating1=>"Great", :rating2=>"Average",
  #    :city1=>"Lisbon", :city2=>"Porto", :match=>"1"}

h = arr[1]
n = count_matches(arr, key_matches, h)
  #=> 2
s = n.to_s
  #=> "2"
h.merge(match: s)
  #=> {:country=>"USA", :rating1=>"Good", :rating2=>"Great",
  #    :city1=>"NYC", :city2=>"Birmingham", :match=>"2"}

h = arr[2]
n = count_matches(arr, key_matches, h)
  #=> 1
s = n.to_s
  #=> "1"
h.merge(match: s)
  #=> {:country=>"UK", :rating1=>"Dismal", :rating2=>"Poor",
  #    :city1=>"Birmingham", :city2=>"London", :match=>"1"}

If the key :match is to be interpreted as a boolean, where '1' and '0' represent true and false respectively, change add_counts to the following.

def add_if_match(arr, *key_matches)
  arr.map { |h| h.merge(match: [count_matches(arr, key_matches, h), 1].min.to_s) }
end
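For reference, here is a self-contained run of add_if_match on the sample data. The helper methods are repeated from above so the snippet runs on its own; only the :match values are extracted for brevity.

```ruby
def match_either_key?(h, g, key_pair)
  (h.values_at(*key_pair) & g.values_at(*key_pair)).any?
end

def match_either_pair?(key_matches, h, g)
  key_matches.any? { |key_pair| match_either_key?(h, g, key_pair) }
end

def count_matches(arr, key_matches, h)
  arr.count { |g| match_either_pair?(key_matches, h, g) } - 1
end

def add_if_match(arr, *key_matches)
  # [count, 1].min caps the count at 1, so :match is "0" or "1" for this data.
  arr.map { |h| h.merge(match: [count_matches(arr, key_matches, h), 1].min.to_s) }
end

arr = [
  { country: "Portugal", rating1: "Great", rating2: "Average", city1: "Lisbon", city2: "Porto" },
  { country: "USA", rating1: "Good", rating2: "Great", city1: "NYC", city2: "Birmingham" },
  { country: "UK", rating1: "Dismal", rating2: "Poor", city1: "Birmingham", city2: "London" }
]

city_flags = add_if_match(arr, [:city1, :city2]).map { |h| h[:match] }
both_flags = add_if_match(arr, [:rating1, :rating2], [:city1, :city2]).map { |h| h[:match] }

p city_flags  #=> ["0", "1", "1"]
p both_flags  #=> ["1", "1", "1"]
```

These are the :match values requested in the question for the "city" search and for the combined "city" and "rating" search.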
