Regex to match exact word in string

I've looked around but haven't been able to find a working solution to my problem.

I have an array of two strings, input, and want to test which element of the array contains the exact word Test.

One thing I have tried (among numerous other attempts):

input = ["Test's string", "Test string"]
# Alternative input array that it needs to work on:
#  ["Testing string", "some Test string"]
substring = "Test"
if (input[0].match(/\b#{substring}\b/))
  puts "Test 0 "
  # Do something...
elsif (input[1].match(/\b#{substring}\b/))
  puts "Test 1"
  # Do something different...
end

The desired result is a print of "Test 1". The input can be more complex, but overall I am looking for a way to find an exact match of a substring in a longer string. I feel like this should be a rather trivial regex, but I haven't been able to come up with the correct pattern. Any help would be greatly appreciated!

The following code may be what you are looking for.

input = ["Testing string", "Test string"]
substring = "Test"
# Non-capturing groups express "start of line or whitespace" and "whitespace or end of line".
pattern = /(?:^|\s)#{substring}(?:\s|$)/

if input[0].match(pattern)
  puts "Test 0"
elsif input[1].match(pattern)
  puts "Test 1"
end

The meaning of the pattern /(?:^|\s)#{substring}(?:\s|$)/ is

  1. (?:^|\s) : the left side of the substring is the beginning of a line (^) or whitespace,

  2. #{substring} : the substring is matched exactly,

  3. (?:\s|$) : the right side of the substring is whitespace or the end of a line ($).
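The same "whitespace or edge of the string on both sides" rule can also be sketched with lookarounds, which check the neighbouring characters without consuming them. This is only a sketch: the method name whole_token_index is made up for illustration, and Regexp.escape is added on the assumption that the substring should be treated as literal text.

```ruby
# Hypothetical helper: index of the first element that contains `word`
# as a whole whitespace-delimited token, or nil if none does.
def whole_token_index(strings, word)
  # (?<!\S) : not preceded by a non-whitespace character
  # (?!\S)  : not followed by a non-whitespace character
  pattern = /(?<!\S)#{Regexp.escape(word)}(?!\S)/
  strings.index { |s| s.match?(pattern) }
end

p whole_token_index(["Test's string", "Test string"], "Test")        #=> 1
p whole_token_index(["Testing string", "some Test string"], "Test")  #=> 1
```

Because the apostrophe is not whitespace, "Test's string" is rejected here, which is the behaviour the question asks for.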

One way to do that is as follows:

input = ["Testing string", "Test"]

"Test #{ input.index { |s| s[/\bTest\b/] } }"
  #=> "Test 1"

input = ["Test", "Testing string"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
  #=> "Test 0"

\b in the regex denotes a word boundary.
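To see where a word boundary does and does not occur, here is a small sketch. The helper name word_boundary_match? is hypothetical, not part of any library.

```ruby
# Hypothetical helper: true when `word` occurs in `text` between word boundaries.
def word_boundary_match?(text, word)
  text.match?(/\b#{Regexp.escape(word)}\b/)
end

p word_boundary_match?("Testing string", "Test")  #=> false
p word_boundary_match?("Test string", "Test")     #=> true
# The apostrophe is a non-word character, so a boundary exists before it.
p word_boundary_match?("Test's string", "Test")   #=> true
```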

Maybe you want a method to return the index of the first element of input that contains the word? That could be:

def matching_index(input, word)
  input.index { |s| s[/\b#{word}\b/i] }
end

input = ["Testing string", "Test"]   
matching_index(input, "Test")    #=> 1
matching_index(input, "test")    #=> 1
matching_index(input, "Testing") #=> 0
matching_index(input, "Testy")   #=> nil

Then you could use it like this, for example:

word = 'Test'
puts "The matching element for '#{word}' is at index #{ matching_index(input, word) }"
  #=> The matching element for 'Test' is at index 1

word = "Testing"
puts "The matching element for '#{word}' is '#{ input[matching_index(input, word)] }'"
  #=> The matching element for 'Testing' is 'Testing string'

The problem is with your boundaries. In your original question, the word Test matches the first string because the ' counts as a \b word boundary. It's a perfect match, and the code is correctly responding with "Test 0". You need to decide how the word should be terminated. Also, if your input contains special characters, I don't think the regex will work properly: /\bTest my $money.*/ will never match because of the unescaped $ in your substring.
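One possible way around the special-character problem is to escape the substring before interpolating it, using Ruby's built-in Regexp.escape. The helper name literal_word_pattern and the sample text below are made up for illustration.

```ruby
# Hypothetical helper: build a pattern that treats `substring` as literal text.
def literal_word_pattern(substring)
  /\b#{Regexp.escape(substring)}\b/
end

text = "Please Test my $money today"

# Unescaped, `$` is an end-of-line anchor, so this can never match.
p text.match?(/\bTest my $money\b/)                    #=> false
# Escaped, `$` is matched as a literal dollar sign.
p text.match?(literal_word_pattern("Test my $money"))  #=> true
```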

What happens if you have multiple matches in your input array? Do you want to do something to all of them or just the first one?
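If every matching element matters rather than just the first, a sketch along these lines would collect all of their indices. The method name matching_indices is hypothetical, and it uses \b boundaries, so "Test's string" would still count as a match.

```ruby
# Hypothetical helper: indices of every element that contains `word` as a whole word.
def matching_indices(input, word)
  pattern = /\b#{Regexp.escape(word)}\b/
  input.each_index.select { |i| input[i].match?(pattern) }
end

p matching_indices(["Test string", "Testing string", "some Test string"], "Test")  #=> [0, 2]
p matching_indices(["Testing string"], "Test")                                     #=> []
```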
