Regular expression to match optional patterns

I know that Regex is a pretty hot topic and that there's a plethora of similar questions, however, I have not found one which matches my needs.

I need to check that each line of my string is formatted as follows:

  • Every line must start with 5 digits.
  • Characters 6 to 12 must be white space.
  • Character 13 must be either white space or an asterisk.
  • If there is any period, colon or semicolon before the final period, that character must not be preceded by white space, but it must be followed by white space.
  • An opening parenthesis cannot be followed by white space.
  • A closing parenthesis cannot be preceded by white space.

I haven't tried to implement the colon, semicolon or parentheses rules yet, because I'm already stuck on the period. These characters are optional, so I can't make a hard check. I'm trying to catch them, but I still get a match in cases like these:

00000      *TEST .FINAL STATEMENT. //Matches, but it shouldn't match.
00001      *TEST2 . FINAL STATEMENT. //Matches, but it shouldn't match.
00002      *TEST3. FINAL STATEMENT. //Matches, **should** match.

This is the regex I have so far:

^\d{5}\s{6}[\s\*][^.]*([^.\s]+\.\s)?[^.]*\..*$

I really don't see how this is happening, especially because I'm using [^.] to accept anything except a period as a wildcard, and the optional pattern looks correct at a glance: if there's a period, it should not have white space before it and it should have white space after it.
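One way to see why all three lines match is to check what the optional group actually captures. The sketch below uses Python's re module, on the assumption that it behaves like the asker's engine for this pattern (the question does not name one). The helper name optional_group_hits is made up for illustration. Because [^.] also accepts white space, the first [^.]* swallows everything up to the first period, the optional group is skipped, \. matches that first period, and .* accepts whatever follows.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"^\d{5}\s{6}[\s\*][^.]*([^.\s]+\.\s)?[^.]*\..*$")

# The three sample lines, built explicitly so the six spaces are unambiguous.
lines = [
    "00000" + " " * 6 + "*TEST .FINAL STATEMENT.",
    "00001" + " " * 6 + "*TEST2 . FINAL STATEMENT.",
    "00002" + " " * 6 + "*TEST3. FINAL STATEMENT.",
]


def optional_group_hits(line):
    """Return (matched, text captured by the optional group, or None)."""
    m = ORIGINAL.match(line)
    return (m is not None, m.group(1) if m else None)


for line in lines:
    # Each line matches, and the optional group never participates.
    print(line, optional_group_hits(line))
```

Every line reports a match with the group captured as None, so the optional check never constrains anything: making the rule optional lets the engine simply skip it.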

Try this:

^\d{5}\s{6}[\s\*]   # Your original pattern
(?:                 # Repeat 0 or more times:
  [^.:;()]*|        # Unconstrained characters
  (?<!\s)[.:;](?=\s)|    # Punctuation after non-space, followed by space
  \((?!\s)|         # Opening parenthesis not followed by space
  (?<!\s)\)         # Closing parenthesis not preceded by space
)*
\.$                 # Period, then end of string

https://regex101.com/r/WwpssV/1

In the last part of the pattern, the characters with special requirements are . : ; ( and ), so use a negated character class to match anything but those characters: [^.:;()]* Then alternate with:

if there is any period, colon or semicolon before the final period, the character must not be preceded by a white space, but it must be followed by a white space.

Fulfilled by (?<!\s)[.:;](?=\s): match one of those characters only if it is not preceded by white space and is followed by white space.

opening parentheses cannot be followed by a white space.

Fulfilled by \((?!\s)

closing parentheses cannot be preceded by a white space.

Fulfilled by (?<!\s)\)

Then just alternate between those four possibilities inside a repeated group, and finish the pattern with \.$ to match the final period at the end of the string.
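For anyone who wants to try the pattern outside regex101, here is a minimal Python sketch. It assumes Python's re module with the re.VERBOSE flag, which is my choice and not something the question specifies. The name is_valid and the extra test lines with parentheses and a colon are invented for illustration. One deliberate change from the pattern above: the unconstrained run uses + instead of *, which accepts the same lines because the enclosing group already repeats zero or more times.

```python
import re

# Adapted from the answer's pattern (verbose mode, + instead of * on the
# unconstrained run; the enclosing group still allows zero repetitions).
LINE_RE = re.compile(
    r"""
    ^\d{5}\s{6}[\s*]        # five digits, six whitespace chars, then whitespace or *
    (?:
        [^.:;()]+           # run of unconstrained characters
      | (?<!\s)[.:;](?=\s)  # . : ; not preceded by whitespace, followed by whitespace
      | \((?!\s)            # opening parenthesis not followed by whitespace
      | (?<!\s)\)           # closing parenthesis not preceded by whitespace
    )*
    \.$                     # final period at end of line
    """,
    re.VERBOSE,
)

PAD = " " * 6  # the six whitespace characters after the line number


def is_valid(line):
    """Return True if the line satisfies the formatting rules (hypothetical helper)."""
    return LINE_RE.match(line) is not None


samples = [
    "00000" + PAD + "*TEST .FINAL STATEMENT.",    # space before period
    "00001" + PAD + "*TEST2 . FINAL STATEMENT.",  # space before period
    "00002" + PAD + "*TEST3. FINAL STATEMENT.",   # well formed
    "00003" + PAD + " CALL (ARG) NOW.",           # well formed parentheses
    "00004" + PAD + "*CALL ( ARG) NOW.",          # space after (
    "00005" + PAD + "*CALL (ARG ) NOW.",          # space before )
    "00006" + PAD + "*NOTE: DONE.",               # well formed colon
]

for line in samples:
    print(is_valid(line), line)
```

Only the lines marked "well formed" in the comments are accepted. Very long lines that fail late may be slow with this alternation-in-a-loop shape in a backtracking engine, so it is worth testing against realistic input.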
