Is there a way to make a regex match something, fail, and then resume matching from the point where it failed?

I am trying to match C++ string literals that contain a particular token (I'm looking for && ). The trouble is that this token also occurs frequently between two string literals on the same line, which produces a large number of false positives.

My idea was this: match a complete string literal, and if it does not contain the token, discard it and continue matching from the position where that literal ended. I'm using Visual Studio's Find function, so the regex engine is the .NET implementation.

I've tried the following regex (with the whitespace removed, obviously), but it still matches the text between two strings on the same line:

(?>
    (?>")
    (?>[^"\r\n"\\&]|\\.)*
    (?>
        (?<AMP>&&)
        (?>[^"\r\n"\\&]|\\.)*
        (?>")
    )
)
(?(AMP)|(?!))

I also tried this, on the assumption that variable-length negative lookbehind is supported:

(?>
    (?>
        (?<!
            (?>
                (?<-STR>"[^"\r\n]*)
                (?<STR>"[^"\r\n]*)
            )
        )(?(STR)(?!))"
    )
    (?>[^"\r\n"\\&]|\\.)*
    (?>
        (?<AMP>&&)
        (?>[^"\r\n"\\&]|\\.)*
        (?>")
    )
)
(?(AMP)|(?!))

Neither worked. Are there any other possibilities, or is this simply beyond the capabilities of a .NET regex?

The following should match:

if (strcmp("hello && goodbye", var) == 0)

but this should not:

if (strcmp("hello", var) == 0 && strcmp("goodbye", var) == 0)
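
For illustration only, here is a minimal Python sketch of the behaviour I'm after, checked against the two sample lines above. It is not the .NET or Visual Studio solution; the function name strings_with_double_amp is made up for this example, and the sketch assumes plain double-quoted literals on a single line (it ignores comments, character literals such as '"', and raw strings). The idea is to consume each whole literal, so that the scan resumes after its closing quote, and only then test the literal for && :

```python
import re

# A double-quoted literal on one line: any character except a quote,
# backslash or line break, or else a backslash followed by one character.
STRING_LITERAL = re.compile(r'"(?:[^"\\\r\n]|\\.)*"')

def strings_with_double_amp(line):
    """Return the string literals in `line` that contain `&&`."""
    # finditer resumes after each literal's closing quote, so the text
    # between two literals is never mistaken for a literal itself.
    return [m.group(0)
            for m in STRING_LITERAL.finditer(line)
            if "&&" in m.group(0)]

print(strings_with_double_amp('if (strcmp("hello && goodbye", var) == 0)'))
# prints ['"hello && goodbye"']
print(strings_with_double_amp(
    'if (strcmp("hello", var) == 0 && strcmp("goodbye", var) == 0)'))
# prints []
```

The filtering here happens outside the regex, which a plain find dialog cannot do; that is why I'm asking whether the regex alone can discard a literal and carry on.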

In the end I gave up and switched to Perl-style regexes on the command line:

grep -RnP --color --include="*.h" --include="*.cpp" '_regex_here_'

with the regex being equivalent to the following once the whitespace is removed:

(?>
    (?>")                               (?# match first ")
    (?>[^"\r\n"\\&]|\\.)*               (?# match till eol, " or &)
    (?>
            (?>
                (?>
                    (?<AMP>\&\&)        (?# found &&, store in capture group AMP)
                |
                    (?>\&)(?!\&)
                )+                      (?# match a single & or a double &&)
                (?>[^"\r\n"\\&]|\\.)*   (?# match till eol, " or &)
            )+                          (?# one or more)
            (?>")                       (?# match last ")
        |
            (?>[^"\r\n"\\&]|\\.)*       (?# match till eol, " or &)
            (?>")                       (?# match last ")
            (*SKIP)                     (?# when fail, start from here)
            (*FAIL)                     (?# fail)
    )
)
(?(<AMP>)|(*FAIL))                      (?# if AMP capture group was not set, then fail)
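
For engines without backtracking verbs, the discard-and-continue effect of (*SKIP)(*FAIL) can be approximated with an alternation: the first branch swallows a whole literal that has no && , and the second branch captures any literal the first branch could not match. The sketch below uses Python's standard re module, which does not support (*SKIP) or (*FAIL); the names PATTERN and find_amp_literals are hypothetical, and it assumes single-line double-quoted literals only. The caller has to check the capture group, so this suits code that can inspect groups rather than a plain find dialog.

```python
import re

# Branch 1 consumes a literal that has no `&&` (a lone `&` is allowed)
# and captures nothing. Branch 2 is tried only when branch 1 fails at the
# same opening quote, so group 1 holds only literals that contain `&&`.
PATTERN = re.compile(
    r'"(?:[^"\\&\r\n]|\\.|&(?!&))*"'   # literal without &&: consumed, discarded
    r'|("(?:[^"\\\r\n]|\\.)*")'        # any other literal: captured in group 1
)

def find_amp_literals(line):
    """Return the literals in `line` that were captured by group 1."""
    return [m.group(1)
            for m in PATTERN.finditer(line)
            if m.group(1) is not None]

print(find_amp_literals('if (strcmp("hello && goodbye", var) == 0)'))
# prints ['"hello && goodbye"']
print(find_amp_literals(
    'if (strcmp("hello", var) == 0 && strcmp("goodbye", var) == 0)'))
# prints []
```

Because every literal is consumed by one branch or the other, the scan never restarts inside a literal or treats the gap between two literals as a string.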
