I've encountered the following token in a regular expression: [\s\S]*?
If I understand this correctly, the character class means "match a whitespace character or a non-whitespace character". Therefore, would this not do exactly the same thing as .*?
One possible difference is that usually . does not match newlines. However, this regular expression was written in Ruby and was passed the m modifier, meaning that the . does, in fact, match newlines.
Is there any other reason to use [\s\S]*? instead of .*?
In case it helps, the regular expression I am looking at appears inside the sprockets library in the HEADER_PATTERN constant on line 97. The full expression is:
/
\A \s* (
(\/\* ([\s\S]*?) \*\/) |
(\#\#\# ([\s\S]*?) \#\#\#) |
(\/\/ ([^\n]*) \n?)+ |
(\# ([^\n]*) \n?)+
)
/mx
You interpreted the regex correctly.
That seems like a relic from other languages that do not support the m flag (called the s flag in other implementations).
A reason to use that construct would be to leave the m flag off: you can then use . without matching newlines, yet still match every character, newlines included, where you need to.
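To illustrate that point, here is a minimal sketch of a pattern that leaves the m flag off and mixes both constructs. The sample text and the variable names are made up for this example and do not come from sprockets.

```ruby
# Hypothetical input: one "key: value" line followed by a block comment.
text = "key: value\n/* first\nsecond */\n"

# No m flag here: `.` stops at the end of the line,
# while `[\s\S]` still crosses newlines inside the comment.
pattern = /^(\w+): (.*)\n\/\*([\s\S]*?)\*\//

match = pattern.match(text)
line_value   = match[2]  # the rest of the first line only
comment_body = match[3]  # everything between /* and */, newline included

p line_value
p comment_body
```

With the m flag turned on, the `(.*)` group would be free to run past the end of the first line, so both behaviours in one pattern would no longer be possible with `.` alone.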
With the m flag, they would be the same, except that .* would be a lot clearer and easier to maintain.
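A quick way to check this in Ruby is to run both forms with and without the m flag against a comment that spans two lines. This is only a sketch; the sample string and the variable names are assumptions made for the example.

```ruby
# Hypothetical input: a block comment that spans two lines.
text = "/* first\nsecond */ trailing"

# String#[] with a regexp and a group number returns that capture,
# or nil when the pattern does not match at all.
dot_without_m   = text[/\/\*(.*?)\*\//, 1]
dot_with_m      = text[/\/\*(.*?)\*\//m, 1]
class_without_m = text[/\/\*([\s\S]*?)\*\//, 1]
class_with_m    = text[/\/\*([\s\S]*?)\*\//m, 1]

p dot_without_m    # no match, because `.` cannot cross the newline
p dot_with_m
p class_without_m
p class_with_m
```

Only the `.` form without the m flag fails to match; the other three capture the same comment body.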
The newline thing is the only difference. Maybe somebody thought it was easier to read without having to know the m context, or wanted it to be robust against a change to that context.
I have seen [^]* used for a similar purpose.