What is the difference between [\s\S]*? and .*?

I've encountered the following token in a regular expression: [\s\S]*?

If I understand this correctly, the character class means "match a whitespace character or a non-whitespace character". Therefore, would this not do exactly the same thing as .*?

One possible difference is that usually . does not match newlines. However, this regular expression was written in Ruby and was passed the m modifier meaning that the . does, in fact, match newlines.

Is there any other reason to use [\s\S]*? instead of .*?

In case it helps, the regular expression I am looking at appears inside the sprockets library in the HEADER_PATTERN constant on line 97. The full expression is:

/
  \A \s* (
    (\/\* ([\s\S]*?) \*\/) |
    (\#\#\# ([\s\S]*?) \#\#\#) |
    (\/\/ ([^\n]*) \n?)+ |
    (\# ([^\n]*) \n?)+
  )
/mx

You interpreted the regex correctly.

That looks like a holdover from languages whose regex engines have no flag that makes . match newlines (the m flag in Ruby, the s flag in most other implementations).

A reason to use that construct would be to leave the m flag off: . then does not match newlines, but [\s\S] can still match any character where you need it to.
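
As a minimal sketch of that idea (the sample string and variable names here are made up for illustration and are not taken from sprockets), a single pattern without the m flag can use . to stay on one line and [\s\S] to span several:

```ruby
# Hypothetical input: a one-line field followed by a multi-line block comment.
text = "key: value\n/* multi\nline */"

# No m flag: (.*) stops at the first newline,
# while ([\s\S]*?) matches any character, newlines included.
pattern = /key: (.*)\n\/\*([\s\S]*?)\*\//

match = pattern.match(text)
first_line = match[1]  # text after "key: " up to the end of that line
comment    = match[2]  # comment body, which spans two lines
```

Here first_line holds only "value", while comment contains the embedded newline. Adding the m flag would let (.*) run past the end of its line as well.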

With the m flag, they would be the same except that .* would be a lot clearer and easier to maintain.
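
A quick way to check this for yourself is sketched below. The sample string and pattern names are illustrative assumptions, not code from sprockets:

```ruby
# Hypothetical input: a block comment that spans two lines.
source = "/* first\nsecond */ rest"

with_class    = %r{/\*([\s\S]*?)\*/}   # character class, no flag needed
with_dot_m    = %r{/\*(.*?)\*/}m       # dot with Ruby's m flag
with_dot_only = %r{/\*(.*?)\*/}        # dot without the flag

class_capture = source[with_class, 1]
dot_m_capture = source[with_dot_m, 1]
dot_capture   = source[with_dot_only, 1]  # nil: . cannot cross the newline
```

The first two captures are identical, and the third pattern fails to match because the comment contains a newline.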

The newline thing is the only difference. Maybe somebody thought it was easier to read without having to know the m context, or wanted it to be robust against a change to that context.

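I have seen [^]* used for a similar purpose in JavaScript, where an empty negated class matches any character. Not every regex engine accepts that syntax.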
