Here is the text that I am using as an example.
" the dogs went to the house. The Dogs, went to the house. The Dog went to the house-wife."
I want to use a regular expression to get the string starting with "dog" and ending with "house". I do not want the second or third sentences, as they both contain punctuation. I do want to pick up "dogs" and "houses".
The regex that I came up with is:
/(D|d)og.[^\p{P}|s]{0,40}house.{0,1}(\s|\.)/
However, it does not seem to work. Here is the error I get:
Error: Parse error on line 4:
... [
"1,10,0,1,/(C|c)limb
---------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', ']', got 'undefined'
validated by jsonlint
I am in economics, not computer programming, so please go easy on me. Let me know if I am missing anything or need to provide additional information. Thank you.
If you want to allow only word characters and whitespace between the two words, and so exclude punctuation, you can use:
/dogs?[\w\s]+houses?[\s.]/i
Explanation:
dog       # 'dog'
s?        # 's' (optional)
[\w\s]+   # one or more characters, each of which is either
          #   a word character (a-z, A-Z, 0-9, _) or
          #   whitespace (\n, \r, \t, \f, and " ")
house     # 'house'
s?        # 's' (optional)
[\s.]     # one character: whitespace (\n, \r, \t, \f, and " ") or '.'
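To check the pattern against the sample text from the question, here is a minimal sketch in Python. Python is an assumption on my part, since the question does not say which regex engine is in use; the pattern itself is unchanged, and `re.IGNORECASE` stands in for the `i` modifier.

```python
import re

# Sample text copied from the question.
text = (" the dogs went to the house. The Dogs, went to the house. "
        "The Dog went to the house-wife.")

# Same pattern as above; re.IGNORECASE plays the role of the /i modifier.
pattern = re.compile(r"dogs?[\w\s]+houses?[\s.]", re.IGNORECASE)

# findall returns every non-overlapping match as a string.
matches = pattern.findall(text)
print(matches)  # ['dogs went to the house.']
```

Only the first sentence matches. In the second, the comma after "Dogs" is neither a word character nor whitespace, and in the third, "house" is followed by a hyphen rather than whitespace or a period.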
If you don't want the trailing punctuation or whitespace included in the result, place a capturing group around the part of the pattern that you want as your match result:
/(dogs?[\w\s]+houses?)[\s.]/i
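As a rough illustration of the difference between the whole match and the captured group, here is a short Python sketch. The engine is again an assumption, and the input sentence is made up for this example so that the plural "houses" is exercised too.

```python
import re

# Hypothetical input, not from the question: plural forms plus a sentence
# that should be skipped because of the comma.
text = "The dogs ran back to the houses. The Dogs, went to the house."

pattern = re.compile(r"(dogs?[\w\s]+houses?)[\s.]", re.IGNORECASE)

m = pattern.search(text)
if m is not None:
    whole = m.group(0)      # full match, including the trailing '.'
    captured = m.group(1)   # only the text inside the parentheses
    print(repr(whole))      # 'dogs ran back to the houses.'
    print(repr(captured))   # 'dogs ran back to the houses'
```

The trailing character is still consumed by the match; the group just lets you read the result without it.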
Or use a lookahead to assert that whitespace or a period follows at that position in the string, without making it part of the match:
/dogs?[\w\s]+houses?(?=[\s.])/i
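A small Python sketch of the lookahead variant, under the same assumption about the engine. It records each match together with the character that follows it, to show that the lookahead checks that character but leaves it out of the match.

```python
import re

# Sample text copied from the question.
text = (" the dogs went to the house. The Dogs, went to the house. "
        "The Dog went to the house-wife.")

pattern = re.compile(r"dogs?[\w\s]+houses?(?=[\s.])", re.IGNORECASE)

# Pair each match with the character right after it. The lookahead
# guarantees that this character exists and is whitespace or '.'.
results = [(m.group(0), text[m.end()]) for m in pattern.finditer(text)]
print(results)  # [('dogs went to the house', '.')]
```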
Note: I added the i modifier for case-insensitive matching as well.