
Regular expression that avoids punctuation or any unwanted characters between two search words

Here is the text that I am using as an example.

" the dogs went to the house. The Dogs, went to the house. The Dog went to the house-wife."

I want to use a regular expression to get the string starting from "dog" and ending with "house". I do not want the second or third sentences, as they both have punctuation. I do want to pick up the plurals "dogs" and "houses".

The regex that I came up with is:

/(D|d)og.[^\p{P}|s]{0,40}house.{0,1}(\s|\.)/

However, it does not seem to work. Here is the error I get:

Error: Parse error on line 4:
... [

        "1,10,0,1,/(C|c)limb
---------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', ']', got 'undefined'
validated by jsonlint

I am in economics, not computer programming so please go easy on me. Let me know if I am missing anything or need to provide additional information. Thank you.

If you want to allow only word characters and whitespace between the two words, and so exclude punctuation, you can use:

/dogs?[\w\s]+houses?[\s.]/i

Explanation:

dog         #  'dog'
 s?         #  's' (optional)
 [\w\s]+    #  any character of: 
            #    word characters (a-z, A-Z, 0-9, _), 
            #    whitespace (\n, \r, \t, \f, and " ") (1 or more times)
house       #  'house'
 s?         #  's' (optional)
 [\s.]      #  any character of: whitespace (\n, \r, \t, \f, and " "), '.'
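
As a quick check, the pattern can be run against the sample text from the question. The sketch below uses Python's `re` module, which is an assumption (the question does not say which regex engine is in use); `re.IGNORECASE` stands in for the `i` modifier, and the variable names are illustrative.

```python
import re

# Sample text from the question.
text = (" the dogs went to the house. The Dogs, went to the house. "
        "The Dog went to the house-wife.")

# Same pattern as above; re.IGNORECASE plays the role of the /i modifier.
pattern = re.compile(r"dogs?[\w\s]+houses?[\s.]", re.IGNORECASE)

matches = pattern.findall(text)
print(matches)  # ['dogs went to the house.']
```

Only the first sentence matches: the comma after "Dogs" and the hyphen after "house" are not in the allowed character classes.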


If you don't want the trailing punctuation or whitespace to be included, place a capturing group around the part of the pattern that you want as your match result:

/(dogs?[\w\s]+houses?)[\s.]/i
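
A minimal sketch of how the capturing group differs from the full match, again assuming Python's `re` module; the shortened sample string and variable names are illustrative only.

```python
import re

text = "the dogs went to the house."
m = re.search(r"(dogs?[\w\s]+houses?)[\s.]", text, re.IGNORECASE)

whole = m.group(0)     # full match, includes the trailing '.'
captured = m.group(1)  # group 1 only, without the trailing character

print(repr(whole))     # 'dogs went to the house.'
print(repr(captured))  # 'dogs went to the house'
```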

Or use a lookahead to assert that whitespace or a period is at that position in the string, without making it part of the match.

/dogs?[\w\s]+houses?(?=[\s.])/i
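
With the lookahead, the full match itself stops before the trailing character. The following is a sketch under the same assumption (Python's `re` module); the sample sentence is made up for illustration.

```python
import re

text = "The dogs went to the houses. Then they left."
m = re.search(r"dogs?[\w\s]+houses?(?=[\s.])", text, re.IGNORECASE)

matched = m.group(0)        # the lookahead checks the next character but does not consume it
next_char = text[m.end()]   # the character the lookahead asserted

print(repr(matched))    # 'dogs went to the houses'
print(repr(next_char))  # '.'
```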

Note: the i modifier was added for case-insensitive matching as well, so the (D|d) alternation is no longer needed.
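
The error quoted in the question is reported by jsonlint, a JSON validator, rather than by a regex engine. One possible cause, assuming the pattern is stored inside a JSON string, is that a backslash sequence such as `\s` or `\p` is not a valid JSON escape, so each backslash would need to be doubled. The sketch below illustrates this with Python's `json` module; the `"pattern"` key is hypothetical and not taken from the question.

```python
import json

pattern = r"dogs?[\w\s]+houses?[\s.]"

# JSON text with single backslashes: \w and \s are not valid JSON escapes.
bad_json = r'{"pattern": "dogs?[\w\s]+houses?[\s.]"}'
try:
    json.loads(bad_json)
    bad_is_valid = True
except json.JSONDecodeError:
    bad_is_valid = False

# json.dumps doubles each backslash, producing valid JSON text.
good_json = json.dumps({"pattern": pattern})
round_trip = json.loads(good_json)["pattern"]

print(bad_is_valid)           # False
print(good_json)              # {"pattern": "dogs?[\\w\\s]+houses?[\\s.]"}
print(round_trip == pattern)  # True
```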
