
Python regex search for alphanumeric characters and forward slash

I'm trying to find if a word matches the following conditions:

  • uses only alphanumeric (and underscore) characters and optionally also a forward slash.
  • is 1 or 2 lines
  • has 1-4 characters per line

My attempt:

pattern = r"\w{1,4}(\n[^\/]\w{1,4})?"
return bool(re.fullmatch(pattern, word))

For example, it should return True if the word meets all of the conditions.

Here are some examples:

  • "EYE"
  • "EYE\n1/2"
  • "SOFT\nTISS"
  • "BLAD\n2"

The alphanumeric part works, but the [^\/] I added to handle the forward slash does not. Any suggestions?

Thanks!

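Problem: [^\/] is a negated class, so it matches any single character other than a forward slash, which is the opposite of what you want. It also consumes one character of the second line before \w{1,4} starts, and \w never matches a slash.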
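To make the failure concrete, here is a small sketch that runs the original pattern from the question against the four sample words. The helper name matches_original is only for illustration and is not part of the question or the answer.

```python
import re

# The asker's original pattern: [^\/] consumes exactly one non-slash
# character, and \w{1,4} then has to match the rest of the second line.
original = r"\w{1,4}(\n[^\/]\w{1,4})?"

def matches_original(word):
    """Hypothetical helper wrapping the question's re.fullmatch call."""
    return bool(re.fullmatch(original, word))

print(matches_original("EYE"))         # single line, the optional group is skipped
print(matches_original("SOFT\nTISS"))  # [^\/] takes "T", \w{1,4} takes "ISS"
print(matches_original("EYE\n1/2"))    # [^\/] takes "1", but \w cannot match "/"
print(matches_original("BLAD\n2"))     # [^\/] takes "2", nothing is left for \w{1,4}
```

Only the first two calls succeed. Note that the second line is effectively required to have 2 to 5 characters here, so the original pattern also breaks the 1-4 characters rule.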

Use

pattern = r"(?=(?:[^/]*/)?[^/]*$)[\w/]{1,4}(\n[\w/]{1,4})?"
return bool(re.fullmatch(pattern, word))

See proof.

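Explanation

Note: the leading ^ and trailing $ in the breakdown are not written in the pattern above; re.fullmatch anchors the match at both ends implicitly.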

--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      [^/]*                    any character except: '/' (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      /                        '/'
--------------------------------------------------------------------------------
    )?                       end of grouping
--------------------------------------------------------------------------------
    [^/]*                    any character except: '/' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  [\w/]{1,4}               any character of: word characters (a-z, A-
                           Z, 0-9, _), '/' (between 1 and 4 times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \n                       '\n' (newline)
--------------------------------------------------------------------------------
    [\w/]{1,4}               any character of: word characters (a-z,
                             A-Z, 0-9, _), '/' (between 1 and 4 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )?                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
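One caveat on the \w rows in the breakdown: for Python 3 str patterns, \w is Unicode-aware by default, so it matches more than a-z, A-Z, 0-9 and _. The sketch below assumes that ASCII-only input is wanted, which the question does not state explicitly. The variable names are illustrative.

```python
import re

PATTERN = r"(?=(?:[^/]*/)?[^/]*$)[\w/]{1,4}(\n[\w/]{1,4})?"

# "\u00c9T\u00c9" is the three-character word "ÉTÉ" (precomposed letters).
word = "\u00c9T\u00c9"

# Default (Unicode) matching: \w accepts the accented letters.
unicode_match = bool(re.fullmatch(PATTERN, word))

# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ascii_match = bool(re.fullmatch(PATTERN, word, flags=re.ASCII))

print(unicode_match, ascii_match)
```

If accented letters should be rejected, passing flags=re.ASCII (or writing the class out as [A-Za-z0-9_/]) is one way to do it.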

Python code:

import re
strings = ["EYE", "EYE\n1/2", "SOFT\nTISS", "BLAD\n2"]
for s in strings:
    print(bool(re.fullmatch(r'(?=(?:[^/]*/)?[^/]*$)[\w/]{1,4}(\n[\w/]{1,4})?', s)))
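The loop above only covers words that should match. As a complementary sketch, the same pattern can be checked against inputs that should be rejected. The helper is_valid_word and the negative examples are made up for illustration, and "optionally also a forward slash" is read here as "at most one slash in the whole word", which is what the lookahead enforces.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
PATTERN = re.compile(r"(?=(?:[^/]*/)?[^/]*$)[\w/]{1,4}(\n[\w/]{1,4})?")

def is_valid_word(word):
    """Hypothetical helper: True if the whole word satisfies the pattern."""
    return bool(PATTERN.fullmatch(word))

accepted = ["EYE", "EYE\n1/2", "SOFT\nTISS", "BLAD\n2", "1/2\nEYE"]
rejected = [
    "",           # empty string: at least one character is required
    "TOOLONG",    # more than 4 characters on a line
    "A\nB\nC",    # three lines
    "1/2\n3/4",   # two slashes fail the lookahead
    "A B",        # a space is neither a word character nor a slash
    "EYE\n",      # a trailing newline with an empty second line
]

for w in accepted:
    print(repr(w), is_valid_word(w))   # each of these is accepted
for w in rejected:
    print(repr(w), is_valid_word(w))   # each of these is rejected
```

The pattern also accepts a lone "/", since nothing requires an alphanumeric character to be present. Whether that is acceptable depends on the data.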

If you want to match /, simply use r'/'.

This should match all your examples:

r'\w{1,4}(\n[\w/]{1,4})?'
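This shorter pattern is not equivalent to the lookahead version. A quick comparison, with made-up test words and illustrative variable names, shows where the two differ:

```python
import re

simple = r"\w{1,4}(\n[\w/]{1,4})?"
strict = r"(?=(?:[^/]*/)?[^/]*$)[\w/]{1,4}(\n[\w/]{1,4})?"

def check(pattern, word):
    """Hypothetical helper: full-string match as a bool."""
    return bool(re.fullmatch(pattern, word))

# Both patterns accept the examples from the question.
for w in ["EYE", "EYE\n1/2", "SOFT\nTISS", "BLAD\n2"]:
    print(repr(w), check(simple, w), check(strict, w))

# A slash on the first line: only the lookahead version accepts it.
print(check(simple, "1/2"), check(strict, "1/2"))

# Several slashes on the second line: only the short version accepts it.
print(check(simple, "EYE\n//"), check(strict, "EYE\n//"))
```

So the short pattern is enough if a slash can only ever appear on the second line and its count does not matter. The lookahead version allows a slash on either line but at most once overall.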
