
How do I loop through a string with an array of strings to find matches?

I am trying to loop through a title string with an array of strings and see which ones from the array match.

My code works fine but I am not sure if it is the most efficient way to do this.

The important thing is that the strings in the array do not have to match a phrase in the title exactly. They can be in any order as long as every word is in the title. Any help would be great.

Example:

title = "Apple Iphone 4 Verizon"
array = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]

I need it to return ["iphone apple", "verizon iphone", "iphone 4"]. Every word in "verizon iphone" and "iphone apple" appears in the title; the order does not matter.

results = [] 

#Loop through all the pids to see if they are found in the title
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]
title = "Apple Iphone 4 White Verizon"
all_pids.each do |pid|
    match = []
    split_id = pid.downcase.split(' ')
    split_id.each do |name|

      in_title = title.downcase.include?(name) 
      if in_title == true
        match << name
      end
    end

    # Compare the word lists rather than joined strings, so extra spaces
    # or capital letters in a pid cannot cause a false mismatch
    if match == split_id
      results << pid
    end

end

print results

When I run this, it prints what I need: ["iphone white 4", "iphone verizon"]

It looks to me like you want the strings whose words are all contained in the title. In other words, each pid's words must be a subset of the title's words.

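Array#- performs set difference: it returns the elements of the left array that do not appear in the right array. For example, [2] - [1,2,3] gives [] and [1,2,3] - [2] gives [1,3]. So a pid matches when its words minus the title's words leaves an empty array.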

title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]
set_of_strings_in_title = title.downcase.split
all_pids.find_all do |pid|
  set_of_strings_not_in_title = pid.downcase.split - set_of_strings_in_title 
  set_of_strings_not_in_title.empty?
end

EDIT: Changed #find to #find_all to return all matches, not just the first.
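
One difference from the code in the question is worth noting: String#include? is a substring test, while the array difference compares whole words. Here is a minimal sketch of how the two can disagree. The method names substring_matches and word_matches and the sample title are made up for illustration; they are not from the original post.

```ruby
# Hypothetical helper names, for illustration only.

# Substring check, as in the question: each word only has to appear
# somewhere inside the lowercased title string.
def substring_matches(title, pids)
  lowered = title.downcase
  pids.select { |pid| pid.downcase.split.all? { |word| lowered.include?(word) } }
end

# Whole-word check, as in the Array#- approach: each word must equal
# one of the title's words.
def word_matches(title, pids)
  title_words = title.downcase.split
  pids.select { |pid| (pid.downcase.split - title_words).empty? }
end

title = "Apple Iphone 4s Verizon"   # assumed sample data
pids  = ["iphone 4", "iphone verizon", "black iphone"]

p substring_matches(title, pids)  # keeps "iphone 4" because "4" occurs inside "4s"
p word_matches(title, pids)       # drops "iphone 4" because "4" is not a title word
```

Which behavior is right depends on your data. If "iphone 4" should not match an "Iphone 4s" title, the whole-word version is probably what you want.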

You could do something like the following:

>> require 'set'
=> true
>> title = "Apple Iphone 4 Verizon"
=> "Apple Iphone 4 Verizon"
>> all_pids = ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
=> ["iphone apple", "verizon iphone", "iphone 3g", "iphone 4", "cool iphone"]
>> title_set = Set.new(title.downcase.split)
=> #<Set: {"apple", "iphone", "4", "verizon"}>
>> all_pids.select { |pid| Set.new(pid.downcase.split).subset? title_set }
=> ["iphone apple", "verizon iphone", "iphone 4"]

You can do something very similar with array differences, but sets might be faster since they are implemented as hashes.
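
If you need this in more than one place, the same idea could be wrapped in a method. This is only a sketch: the name matching_pids is an assumption, and it builds the title set once and checks each pid's words against it instead of creating a Set per pid.

```ruby
require 'set'

# Hypothetical helper: returns the pids whose words all appear in the title.
def matching_pids(title, pids)
  title_words = Set.new(title.downcase.split)  # built once, reused for every pid
  pids.select do |pid|
    # all? stops at the first word that is missing from the title
    pid.downcase.split.all? { |word| title_words.include?(word) }
  end
end

title = "Apple Iphone 4 White Verizon"
all_pids = ["iphone 3gs", "iphone white 4", "iphone verizon", "black iphone", "at&t      iphone"]

p matching_pids(title, all_pids)
# => ["iphone white 4", "iphone verizon"]
```

Since split with no arguments splits on any run of whitespace, the extra spaces in "at&t      iphone" do not need special handling.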
