Regex match lines with no more than a certain number of specific characters?

This is my regex so far (assume PHP flavour):

^(([^\\]+)\\([^\\]+)){1,4}$

And my test data:

U:\16. New Products\#Complete\Bottle Openers\20170210 St Patrick Bottle Openers\Small Lifestyles
U:\16. New Products\#Complete\Canvas
U:\16. New Products

The goal is to find all lines with no more than 4 backslashes. In this example I expect to match the second and third lines. However, when I test it in regex101 it seems to match across multiple lines, despite having the multiline flag set and using ^ and $. What am I doing wrong?

The [^\\] pattern is a negated character class that matches any char but a \ char, and thus it can match line breaks. To quickly fix the issue, you might add \n (and perhaps \r) to the negated character class and use

^(([^\\\n\r]+)\\([^\\\n\r]+)){1,4}$

See the regex demo. The [^\\\n\r] class cannot match CR and LF symbols: it matches any char but a \, LF and CR.
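The effect can be reproduced on the test data from the question. This is a minimal sketch using Python's re module instead of PHP (an assumption made only so the example is easy to run; the constructs used here are also valid in PCRE). The names lines, original and fixed are hypothetical.

```python
import re

# Test data from the question, joined into one multi-line string.
lines = [
    r"U:\16. New Products\#Complete\Bottle Openers\20170210 St Patrick Bottle Openers\Small Lifestyles",
    r"U:\16. New Products\#Complete\Canvas",
    r"U:\16. New Products",
]
text = "\n".join(lines)

# Pattern from the question: [^\\] also matches line breaks.
original = re.compile(r"^(([^\\]+)\\([^\\]+)){1,4}$", re.MULTILINE)

# Quick fix: exclude LF and CR from the negated character classes.
fixed = re.compile(r"^(([^\\\n\r]+)\\([^\\\n\r]+)){1,4}$", re.MULTILINE)

first = original.search(text)
print("original match spans a line break:", "\n" in first.group(0))

fixed_matches = [m.group(0) for m in fixed.finditer(text)]
for line in fixed_matches:
    print("fixed match:", line)
```

With the line breaks excluded, each match stays on a single line, so only the second and third lines are reported.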

A better regex for this task would be

^[^\\\n\r]*(?:\\[^\\\n\r]*){0,4}$

Or, with the last quantified part set to possessive to enhance efficiency:

^[^\\\n\r]*(?:\\[^\\\n\r]*){0,4}+$

See this regex demo.
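As a rough illustration, the non-possessive pattern above can be wrapped in a small helper that takes the limit as a parameter. This is a sketch in Python's re module rather than PHP, and lines_with_at_most is a hypothetical name, not something from the original answer.

```python
import re


def lines_with_at_most(text, max_backslashes=4):
    """Return the lines of text that contain at most max_backslashes backslashes."""
    # Same shape as the pattern above, with the upper bound filled in.
    pattern = re.compile(
        r"^[^\\\n\r]*(?:\\[^\\\n\r]*){0," + str(max_backslashes) + r"}$",
        re.MULTILINE,
    )
    # No capturing groups, so findall returns the whole matched lines.
    return pattern.findall(text)


lines = [
    r"U:\16. New Products\#Complete\Bottle Openers\20170210 St Patrick Bottle Openers\Small Lifestyles",
    r"U:\16. New Products\#Complete\Canvas",
    r"U:\16. New Products",
]
text = "\n".join(lines)

for line in lines_with_at_most(text):
    print(line)
```

Unlike the quick fix, this pattern also accepts lines with no backslash at all, because every part of it may match zero characters.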

Details

  • ^ - start of a line (with the multiline flag set; otherwise start of string)
  • [^\\\n\r]* - zero or more chars other than \, LF and CR
  • (?:\\[^\\\n\r]*){0,4} - 0 to 4 occurrences of
    • \\ - a \ char
    • [^\\\n\r]* - zero or more chars other than \, LF and CR
  • $ - end of a line (with the multiline flag set; otherwise end of string).
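The breakdown above can also be written directly into the pattern using free-spacing mode. The sketch below uses Python's re.VERBOSE flag for illustration (PCRE has a comparable x modifier); AT_MOST_FOUR and has_at_most_four are hypothetical names.

```python
import re

# The same pattern, annotated piece by piece in free-spacing mode.
AT_MOST_FOUR = re.compile(
    r"""
    ^                # start of a line (multiline mode)
    [^\\\n\r]*       # zero or more chars other than a backslash, LF and CR
    (?:              # a group repeated 0 to 4 times:
        \\           #   a literal backslash
        [^\\\n\r]*   #   zero or more chars other than a backslash, LF and CR
    ){0,4}
    $                # end of a line (multiline mode)
    """,
    re.MULTILINE | re.VERBOSE,
)


def has_at_most_four(line):
    """Check a single line (no line breaks) against the pattern."""
    return AT_MOST_FOUR.fullmatch(line) is not None


print(has_at_most_four(r"U:\16. New Products\#Complete\Canvas"))
print(has_at_most_four(r"a\b\c\d\e\f"))
```

Comments inside the pattern avoid literal backslashes on purpose, so that nothing in them can be read as an escape sequence.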
