
Python regex to find symbol digit symbol

I wrote this regex in Python and tested it out on regex101, but it is still not working the way I want:

((^[-\/\\\(\)\s\,\&\.]+)?([0-9]+)([-\/\\\(\)\s\,\&\.])+)

What I am trying to find is the pattern where the string optionally starts or ends with one of these symbols, and has ONLY digits in the middle:

-/\()& .

This list includes dash, forward slash, backslash, parentheses, ampersand, blank space, and period. A search should return True if the string contains ONLY digits in the middle, with optional punctuation at the beginning and/or end of the string.

This regex seems to work for most cases, but fails if I add a letter into the digits in the middle. It still ends up returning True. How should I change this regex so that it only returns True for strings of the form: optional symbols, then all digits, then optional symbols?

Cases where it should return True:

  1. symbol + digits, e.g. (9672
  2. only digits, e.g. 20427304 or 8
  3. digits + symbol, e.g. 345--
  4. symbol + digits + symbol, e.g. (67-.

Case where it should NOT return True (because of the 'y' in the string):

(678983y733)..

There are a few things in your regex that need to change.

  • First of all, you have WAY too much escaping going on there, which makes it super confusing to read.

  • Secondly, you have weird stuff happening with the parentheses. You don't need a group that surrounds the whole regex, because group 0 (match.group(0) in Python) already returns the entire match.

  • Your last char class is not optional in your regex.

  • You need to anchor the pattern with ^ at the start and $ at the end to ensure that the regex can't match just part of the string.
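To see the last two points in action, here is a small sketch that runs the pattern from the question, copied verbatim, against two inputs. The variable names are only for illustration.

```python
import re

# The pattern from the question, copied verbatim into a raw string.
original = re.compile(r'((^[-\/\\\(\)\s\,\&\.]+)?([0-9]+)([-\/\\\(\)\s\,\&\.])+)')

# No $ anchor and an optional leading group: search() is free to skip
# past the 'y' and match only the tail of the string.
m = original.search('(678983y733)..')
partial = m.group(0) if m else None
print(partial)

# The trailing character class is required (+ rather than *),
# so a string of digits alone is rejected.
digits_only = original.search('8')
print(digits_only)
```

The first call finds a match on the tail of the string only, which is why the search still looks truthy even though the input contains a letter.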

Here's what I came up with:

^([-/\\()\s,&.]*)([0-9]+)([-/\\()\s,&.]*)$

Note that ([something]+)? matches the same text as ([something]*), but the latter is more readable.
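As a quick check, the pattern above can be run against the cases listed in the question. This is a minimal sketch; the helper name is_symbol_digit_symbol is made up for the example.

```python
import re

# The anchored pattern from this answer, as a raw string.
pattern = re.compile(r'^([-/\\()\s,&.]*)([0-9]+)([-/\\()\s,&.]*)$')

def is_symbol_digit_symbol(text):
    """Hypothetical helper: True if text is symbols* digits+ symbols*."""
    return pattern.search(text) is not None

should_match = ['(9672', '20427304', '8', '345--', '(67-.']
should_not_match = ['(678983y733)..']

results = {s: is_symbol_digit_symbol(s) for s in should_match + should_not_match}
for s, ok in results.items():
    print(repr(s), ok)

# The three groups give the leading symbols, the digits, and the trailing symbols.
parts = pattern.search('(67-.').groups()
print(parts)
```

One thing to keep in mind: in Python, $ also matches just before a trailing newline, so re.fullmatch may be the safer choice if the input can end with '\n'.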

I think what you are looking for is re.fullmatch.

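import re
# Allowed symbols from the question (including space and period); the raw string keeps the backslash literal.
ponct = '[' + re.escape(r'-/\()& .') + ']*'
# Optional symbols, one or more digits, optional symbols.
p = re.compile(ponct + '[0-9]+' + ponct)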

Then p.fullmatch('(678983y733)') will return None, and all your other examples will return a match.
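A short sketch of that check, assuming the symbol set is exactly the list given in the question (dash, slashes, parentheses, ampersand, space, period). The names symbols, accepted, and rejected are only for illustration.

```python
import re

# Assumption: the allowed symbols are the ones listed in the question.
symbols = r'-/\()& .'
ponct = '[' + re.escape(symbols) + ']*'
p = re.compile(ponct + '[0-9]+' + ponct)

cases = ['(9672', '20427304', '8', '345--', '(67-.']

# fullmatch() only succeeds if the whole string fits the pattern,
# so no ^ or $ anchors are needed.
accepted = [s for s in cases if p.fullmatch(s)]
rejected = p.fullmatch('(678983y733)..')

print(accepted)
print(rejected)
```

Building the class with re.escape means you don't have to work out by hand which symbols need a backslash.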

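The pattern below allows you to find them embedded in a longer string, not just at the start. The ? allows zero or one symbol on each side. Change this to * if you want zero or more leading/trailing symbols. Note that the character class here leaves out parentheses and whitespace.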

([-\\\/\&\.]?)\b([0-9]+)\b([-\\\/\&\.]?)
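Here is a rough sketch of how that pattern behaves with finditer on a longer string. The sample text is invented for the example.

```python
import re

# The pattern from this answer, copied verbatim into a raw string.
embedded = re.compile(r'([-\\\/\&\.]?)\b([0-9]+)\b([-\\\/\&\.]?)')

# Made-up sample text containing two valid runs and one broken by a letter.
text = 'order -42. shipped, ref 7y9 ignored, code /1001&'

matches = list(embedded.finditer(text))
found = [m.group(0) for m in matches]
print(found)

# Each match exposes the leading symbol, the digits, and the trailing symbol.
first_parts = matches[0].groups()
print(first_parts)
```

The \b word boundaries are what keep 7y9 out: there is no boundary between a digit and a letter, so neither 7 nor 9 can match on its own.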


 