How to improve this regular expression in Python?

I need a regular expression that matches all of the following:

1 member
2 members
10 members
100 members
1,000 members
10,000 members
100,000 members
100,000,000 members
999,999,999,999 members

So I did:

\d+ member|
\d+ members|
\d+,\d+ members|
\d+,\d+,\d+ members|
\d+,\d+,\d+,\d+ members

You can see it interactively here: https://regex101.com/r/oW3bJ6/2

But deep in my heart I know this is very ugly. Could you guys/girls help me find an elegant solution?

Why not just this?

\d+(?:,\d+)* members?

If you prefer to verify the digits are in groups of three:

\d+(?:,\d{3})* members?

(edited to add ? after s per Fredrik in the comments)
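To see how the two patterns above differ in practice, here is a minimal sketch that runs both against the sample lines from the question. The names SAMPLES, LOOSE, GROUPED and accepts are made up for this illustration, and fullmatch is an assumption about how the pattern would be applied (whole line, nothing before or after).

```python
import re

# Sample lines copied from the question.
SAMPLES = [
    "1 member", "2 members", "10 members", "100 members",
    "1,000 members", "10,000 members", "100,000 members",
    "100,000,000 members", "999,999,999,999 members",
]

LOOSE = re.compile(r"\d+(?:,\d+)* members?")      # any number of digits after a comma
GROUPED = re.compile(r"\d+(?:,\d{3})* members?")  # exactly three digits after a comma


def accepts(pattern, line):
    """Return True when the whole line matches the pattern."""
    return pattern.fullmatch(line) is not None


loose_all = all(accepts(LOOSE, s) for s in SAMPLES)
grouped_all = all(accepts(GROUPED, s) for s in SAMPLES)

# A malformed separator is what tells the two patterns apart.
loose_malformed = accepts(LOOSE, "1,00 members")
grouped_malformed = accepts(GROUPED, "1,00 members")

print(loose_all, grouped_all, loose_malformed, grouped_malformed)
# prints: True True True False
```

Both patterns accept every sample; only the second one rejects a comma that is not followed by exactly three digits.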

\d+[,\d\s]+members?
  • \d+ matches one or more digits [0-9]
  • [,\d\s]+ matches one or more characters from this set: the literal character , or \d (a digit [0-9]) or \s (any whitespace character [\r\n\t\f ])
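Because the character class mixes digits, commas and whitespace, this pattern is fairly permissive. A small sketch (the helper name accepts and the test strings are mine, not from the answer) shows what it lets through when applied to a whole line:

```python
import re

CLASS_BASED = re.compile(r"\d+[,\d\s]+members?")


def accepts(line):
    """Return True when the whole line matches the pattern."""
    return CLASS_BASED.fullmatch(line) is not None


# Lines from the question are accepted.
expected_ok = accepts("1 member") and accepts("999,999,999,999 members")

# The class can be satisfied by a digit, so no space is required.
no_space = accepts("12members")

# Commas and spaces may repeat in any order.
odd_separators = accepts("1,, 3 members")

print(expected_ok, no_space, odd_separators)
# prints: True True True
```

If the input is known to be clean this may not matter; otherwise one of the stricter patterns in this thread is probably a better fit.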

You can also try this:

(\d|,)+ members?

First, (\d|,)+ matches a decimal digit or a comma, one or more times. The regex then matches a space, followed by member or members (the ? means the s can occur 0 or 1 times).
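Two side effects of this pattern are worth knowing; the sketch below illustrates them. CHAR_CLASS is my own rewrite of the same idea as a character class, not something the answer proposed.

```python
import re

ALTERNATION = re.compile(r"(\d|,)+ members?")
CHAR_CLASS = re.compile(r"[\d,]+ members?")  # same language, no capturing group

# A repeated capturing group keeps only its last repetition.
m = ALTERNATION.fullmatch("1,000 members")
last_capture = m.group(1)

# Nothing forces a digit to appear, so commas alone are accepted.
commas_only = ALTERNATION.fullmatch(",,, members") is not None

# Both spellings agree on these inputs.
inputs = ["1 member", "1,000 members", ",,, members", "members", "1,0 member"]
same_verdict = all(
    (ALTERNATION.fullmatch(s) is None) == (CHAR_CLASS.fullmatch(s) is None)
    for s in inputs
)

print(last_capture, commas_only, same_verdict)
# prints: 0 True True
```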

This will match everything in the list:

\d+(,\d{3})* member(s)?

But it will also match: 1 members

Is that acceptable? If not, you could use:

1 member|\d+(,\d{3})* members
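A quick check of that alternation (written with no space after the |, and applied to the whole line with fullmatch, which is my assumption) suggests it behaves a little differently from what the text implies: it stops accepting a singular "member" after numbers other than 1, but "1 members" still gets through, because \d+ in the second branch also matches 1. Excluding it would need something extra, such as a negative lookahead.

```python
import re

SPLIT = re.compile(r"1 member|\d+(,\d{3})* members")


def accepts(line):
    """Return True when the whole line matches the pattern."""
    return SPLIT.fullmatch(line) is not None


singular_one = accepts("1 member")      # first branch
singular_five = accepts("5 member")     # neither branch
plural_one = accepts("1 members")       # second branch, \d+ matches "1"
plural_big = accepts("1,000 members")   # second branch

print(singular_one, singular_five, plural_one, plural_big)
# prints: True False True True
```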

I'm not sure how pedantic you need your expression to be, but your accepted answer will give you some false positives with respect to your example, i.e., the following lines, among others, will match. Whether that is acceptable is up to you:

1 members       # Plural members for '1'
5 member        # Non-plural member
1000,0 members  # Invalid comma separator
1000000 members # Missing comma separator
00000 members   # Multiple zeros (or any other number)
010 member      # Leading zeros
1, 1 member     # Invalid
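The thread as quoted here does not say which answer was accepted. As a sketch, assuming it was the loosest pattern, \d+(?:,\d+)* members?, and assuming it is applied with re.search (so it may match anywhere in the line), every line in the list above does produce a match:

```python
import re

# Assumption: this is the accepted pattern; the thread does not confirm it.
ASSUMED_ACCEPTED = re.compile(r"\d+(?:,\d+)* members?")

# The lines listed above, without their trailing comments.
FALSE_POSITIVES = [
    "1 members",
    "5 member",
    "1000,0 members",
    "1000000 members",
    "00000 members",
    "010 member",
    "1, 1 member",
]

# Map each line to the substring that re.search matched (or None).
found = {}
for line in FALSE_POSITIVES:
    m = ASSUMED_ACCEPTED.search(line)
    found[line] = m.group(0) if m else None

all_match = all(v is not None for v in found.values())
print(all_match)
# prints: True

# The last line only matches as a substring.
print(found["1, 1 member"])
# prints: 1 member
```

With re.match or re.fullmatch instead of re.search, the last line would no longer match, since the pattern cannot get past the comma followed by a space.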

The following regular expression will match the exact pattern stated in your example:

^1 member|^[1-9]\d{0,2}(,\d{3})* members

^ ensures we match starting at the beginning of the line.

1 member is a special, non-plural case

[1-9]\d{0,2} matches the numbers 1-999, but not expressions with a leading 0 (such as 0 or 010) ...

(,\d{3})* followed by any number of groups of ',000-999'
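Running this expression over the false positives listed earlier is a useful sanity check. In the sketch below, STRICT is the expression from this answer; TIGHT is a hypothetical variant of my own that adds a negative lookahead and is applied with fullmatch, so treat it as one possible tightening rather than part of the original answer.

```python
import re

STRICT = re.compile(r"^1 member|^[1-9]\d{0,2}(,\d{3})* members")

# Hypothetical tightening: (?!1 ) keeps a lone 1 out of the plural branch,
# and fullmatch (used below) rejects trailing text.
TIGHT = re.compile(r"1 member|(?!1 )[1-9]\d{0,2}(?:,\d{3})* members")

SAMPLES = [
    "1 member", "2 members", "10 members", "100 members",
    "1,000 members", "10,000 members", "100,000 members",
    "100,000,000 members", "999,999,999,999 members",
]
FALSE_POSITIVES = [
    "1 members", "5 member", "1000,0 members", "1000000 members",
    "00000 members", "010 member", "1, 1 member",
]

strict_samples = all(STRICT.match(s) for s in SAMPLES)
tight_samples = all(TIGHT.fullmatch(s) for s in SAMPLES)

# Which of the unwanted lines still get through?
strict_hits = [s for s in FALSE_POSITIVES if STRICT.match(s)]
tight_hits = [s for s in FALSE_POSITIVES if TIGHT.fullmatch(s)]

print(strict_samples, tight_samples)
# prints: True True
print(strict_hits, tight_hits)
# prints: ['1 members'] []
```

So the expression rejects six of the seven lines, while "1 members" still matches. Since it has no end anchor, it also matches a line that merely starts with a valid phrase, such as "1 membership".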
