简体   繁体   中英

Find most common string in an array

I have this array, for example (the size is variable):

   x = ["1.111", "1.122", "1.250", "1.111"]

and I need to find the most commom value ( "1.111" in this case).

Is there an easy way to do that?

Tks in advance!


EDIT #1: Thank you all for the answers!


EDIT #2: I've changed my accepted answer based on ZED's information. Thank you all again!

Ruby < 2.2

#!/usr/bin/ruby1.8

def most_common_value(a)
  a.group_by do |e|
    e
  end.values.max_by(&:size).first
end

x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_value(x)    # => "1.111"

Note: Enumberable.max_by is new with Ruby 1.9, but it has been backported to 1.8.7

Ruby >= 2.2

Ruby 2.2 introduces the Object#itself method, with which we can make the code more concise:

def most_common_value(a)
  a.group_by(&:itself).values.max_by(&:size).first
end

As a monkey patch

Or as Enumerable#mode :

Enumerable.class_eval do
  def mode
    group_by do |e|
      e
    end.values.max_by(&:size).first
  end
end

["1.111", "1.122", "1.250", "1.111"].mode
# => "1.111"

One pass through the hash to accumulate the counts. Use .max() to find the hash entry with the largest value.

#!/usr/bin/ruby

a = Hash.new(0)
["1.111", "1.122", "1.250", "1.111"].each { |num|
  a[num] += 1
}

a.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]

or, roll it all into one line:

ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]

If you only want the item back add .first():

ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first # => "1.111"

The first sample I used is how it would be done in Perl usually. The second is more Ruby-ish. Both work with older versions of Ruby. I wanted to compare them, plus see how Wayne's solution would speed things up so I tested with benchmark:

#!/usr/bin/env ruby

require 'benchmark'

ary = ["1.111", "1.122", "1.250", "1.111"] * 1000 

def most_common_value(a)
  a.group_by { |e| e }.values.max_by { |values| values.size }.first
end

n = 1000
Benchmark.bm(20) do |x|
  x.report("Hash.new(0)") do
    n.times do 
      a = Hash.new(0)
      ary.each { |num| a[num] += 1 }
      a.max{ |a,b| a[1] <=> b[1] }.first
    end 
  end

  x.report("inject:") do
    n.times do
      ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first
    end
  end

  x.report("most_common_value():") do
    n.times do
      most_common_value(ary)
    end
  end
end

Here's the results:

user     system      total        real
Hash.new(0)           2.150000   0.000000   2.150000 (  2.164180)
inject:               2.440000   0.010000   2.450000 (  2.451466)
most_common_value():  1.080000   0.000000   1.080000 (  1.089784)

You could sort the array and then loop over it once. In the loop just keep track of the current item and the number of times it is seen. Once the list ends or the item changes, set max_count == count if count > max_count . And of course keep track of which item has the max_count .

You could create a hashmap that stores the array items as keys with their values being the number of times that element appears in the array.

Pseudo Code:

["1.111", "1.122", "1.250", "1.111"].each { |num|
  count=your_hash_map.get(num)
  if(item==nil)
    hashmap.put(num,1)
  else
    hashmap.put(num,count+1)
}

As already mentioned, sorting might be faster.

Using the default value feature of hashes:

>> x = ["1.111", "1.122", "1.250", "1.111"]
>> h = Hash.new(0)
>> x.each{|i| h[i] += 1 }
>> h.max{|a,b| a[1] <=> b[1] }
["1.111", 2]

It will return most popular value in array

x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[0]

IE:

x = ["1.111", "1.122", "1.250", "1.111"]
# Most popular
x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[0]
#=> "1.111
# How many times
x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[1].size
#=> 2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM