I have a function that generates a random email address:
def emails
  names = ["alfred", "daniel", "elisa", "ana", "ramzes"]
  surnames = ["oak", "leaf", "grass", "fruit"]
  providers = ["gmail", "yahoo", "outlook", "icloud"]
  address = "#{names.sample}.#{surnames.sample}#{rand(100..5300)}@#{providers.sample}.com"
end
Given a list of randomly generated email addresses:
email_list = 100.times.map { emails }
that looks like this:
daniel.oak3985@icloud.com
ramzes.grass1166@icloud.com
daniel.fruit992@yahoo.com
...
how can I select the most common provider ("gmail", "yahoo", etc.)?
Your question is similar to this one. There's a twist, though: you don't want to analyze the frequency of the email addresses themselves, but of their providers.
def random_email
  names = ["alfred", "daniel", "elisa", "ana", "ramzes"]
  surnames = ["oak", "leaf", "grass", "fruit"]
  providers = ["gmail", "yahoo", "outlook", "icloud"]
  address = "#{names.sample}.#{surnames.sample}#{rand(100..5300)}@#{providers.sample}.com"
end
emails = Array.new(100) { random_email }
# Count addresses per provider; Hash.new(0) makes missing keys start at 0.
# The block parameter is named counts so it does not shadow the outer freq.
freq = emails.each_with_object(Hash.new(0)) do |email, counts|
  provider = email.split('@').last
  counts[provider] += 1
end
p freq
#=> {"outlook.com"=>24, "yahoo.com"=>28, "gmail.com"=>32, "icloud.com"=>16}
p freq.max_by{|provider, count| count}.first
#=> "gmail.com"
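On Ruby 2.7 or later, Enumerable#tally can build the same frequency hash in a single call. Here is a minimal sketch of that variant. The fixed sample_emails list is made up for illustration (its first three entries are the ones shown in the question) so that the result is reproducible:

```ruby
# Fixed sample data, for illustration only (not produced by random_email).
sample_emails = [
  "daniel.oak3985@icloud.com",
  "ramzes.grass1166@icloud.com",
  "daniel.fruit992@yahoo.com",
  "ana.leaf2041@gmail.com",
  "elisa.oak377@icloud.com"
]

# Keep the part after '@', then count occurrences (tally needs Ruby >= 2.7).
provider_counts = sample_emails.map { |email| email.split('@').last }.tally
p provider_counts
#=> {"icloud.com"=>3, "yahoo.com"=>1, "gmail.com"=>1}

# Pick the pair with the highest count and keep only the provider name.
most_common = provider_counts.max_by { |_provider, count| count }.first
p most_common
#=> "icloud.com"
```

On older Ruby versions, the each_with_object version above does the same job.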
email_list = 10.times.map { emails }
#=> ["alfred.grass426@gmail.com", "elisa.oak239@icloud.com",
#    "daniel.fruit1600@outlook.com", "ana.fruit3761@icloud.com",
#    "daniel.grass742@yahoo.com", "elisa.oak3891@outlook.com",
#    "alfred.leaf1321@gmail.com", "alfred.grass5295@outlook.com",
#    "ramzes.fruit435@gmail.com", "ana.fruit4233@yahoo.com"]
email_list.group_by { |s| s[/@\K.+/] }.max_by { |_,v| v.size }.first
#=> "gmail.com"
\K in the regex means disregard everything matched so far. Alternatively, @\K could be replaced by the positive lookbehind (?<=@).
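As a quick check of that equivalence, the sketch below applies both patterns to one address from the list above. The variable names are chosen here for illustration only:

```ruby
address = "alfred.grass426@gmail.com"

with_keep       = address[/@\K.+/]     # \K drops the already-matched "@"
with_lookbehind = address[/(?<=@).+/]  # lookbehind requires "@" just before the match
without_either  = address[/@.+/]       # plain pattern keeps the "@"

p with_keep        #=> "gmail.com"
p with_lookbehind  #=> "gmail.com"
p without_either   #=> "@gmail.com"
```

Both forms return only the text after the @, which is what group_by needs as its key.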
The steps are as follows.
h = email_list.group_by { |s| s[/@\K.+/] }
#=> {"gmail.com"  =>["alfred.grass426@gmail.com", "alfred.leaf1321@gmail.com",
#                    "ramzes.fruit435@gmail.com"],
#    "icloud.com" =>["elisa.oak239@icloud.com", "ana.fruit3761@icloud.com"],
#    "outlook.com"=>["daniel.fruit1600@outlook.com", "elisa.oak3891@outlook.com",
#                    "alfred.grass5295@outlook.com"],
#    "yahoo.com"  =>["daniel.grass742@yahoo.com", "ana.fruit4233@yahoo.com"]}
a = h.max_by { |_,v| v.size }
#=> ["gmail.com", ["alfred.grass426@gmail.com", "alfred.leaf1321@gmail.com",
#                  "ramzes.fruit435@gmail.com"]]
a.first
#=> "gmail.com"
If, as here, there is a tie for the most frequent provider (gmail.com and outlook.com each appear three times), max_by returns only one of them. Modify the code as follows to get all winners.
h = email_list.group_by { |s| s[/@\K.+/] }
# (same as above)
mx_size = h.map { |_,v| v.size }.max
#=> 3
h.select { |_,v| v.size == mx_size }.keys
#=> ["gmail.com", "outlook.com"]
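For reuse, the tie-aware steps could be wrapped in a small helper. This is only a sketch: the method name most_common_providers is hypothetical, and it assumes Ruby 2.4 or later for Hash#transform_values. It is run here against the same ten addresses as above:

```ruby
# Hypothetical helper: returns every provider that shares the highest count.
def most_common_providers(addresses)
  # Group by the text after "@", then reduce each group to its size.
  counts = addresses.group_by { |s| s[/@\K.+/] }.transform_values(&:size)
  top = counts.values.max
  counts.select { |_provider, n| n == top }.keys
end

email_list = [
  "alfred.grass426@gmail.com", "elisa.oak239@icloud.com",
  "daniel.fruit1600@outlook.com", "ana.fruit3761@icloud.com",
  "daniel.grass742@yahoo.com", "elisa.oak3891@outlook.com",
  "alfred.leaf1321@gmail.com", "alfred.grass5295@outlook.com",
  "ramzes.fruit435@gmail.com", "ana.fruit4233@yahoo.com"
]

winners = most_common_providers(email_list)
p winners
#=> ["gmail.com", "outlook.com"]
```

For an empty list, the helper returns an empty array, because no count equals the nil maximum.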