
How to improve that regular expression in Python?

I need a regular expression that matches these criteria:

1 member
2 members
10 members
100 members
1,000 members
10,000 members
100,000 members
100,000,000 members
999,999,999,999 members

So I did:

\d+ member|
\d+ members|
\d+,\d+ members|
\d+,\d+,\d+ members|
\d+,\d+,\d+,\d+ members

You can see it interactively here: https://regex101.com/r/oW3bJ6/2

But deep in my heart I know this is very ugly. Could you guys/girls help me find an elegant solution?

Why not just this?

\d+(?:,\d+)* members?

If you prefer to verify the digits are in groups of three:

\d+(?:,\d{3})* members?
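As a quick check, here is a minimal Python sketch that runs both patterns from this answer against the sample lines in the question. The names `SAMPLES`, `LOOSE`, `GROUPED` and `matches_all` are made up for this illustration, and using `fullmatch` (so the whole string has to match) is an assumption about how the pattern would be applied.

```python
import re

# Sample strings copied from the question.
SAMPLES = [
    "1 member",
    "2 members",
    "10 members",
    "100 members",
    "1,000 members",
    "10,000 members",
    "100,000 members",
    "100,000,000 members",
    "999,999,999,999 members",
]

LOOSE = re.compile(r"\d+(?:,\d+)* members?")       # any comma-separated digit runs
GROUPED = re.compile(r"\d+(?:,\d{3})* members?")   # comma groups must be 3 digits


def matches_all(pattern, lines):
    """Return True if every line is matched in full by the pattern."""
    return all(pattern.fullmatch(line) for line in lines)


print(matches_all(LOOSE, SAMPLES))
print(matches_all(GROUPED, SAMPLES))

# Only the grouped variant rejects a comma group that is not three digits long.
print(bool(LOOSE.fullmatch("1000,0 members")))
print(bool(GROUPED.fullmatch("1000,0 members")))
```

Note that the grouped variant still allows any number of digits before the first comma, since it starts with `\d+`.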

(edited to add ? after s per Fredrik in the comments)

\d+[,\d\s]+members?
  • \d+ matches one or more digits [0-9]
  • [,\d\s]+ matches one or more characters from the set: the literal character , (comma), \d (a digit [0-9]) and \s (any whitespace character [\r\n\t\f ])
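To see what that character class lets through, here is a small Python sketch. The test strings and the use of `fullmatch` are my own choices for illustration, not part of the answer.

```python
import re

PATTERN = re.compile(r"\d+[,\d\s]+members?")

# The last three strings are not in the question; they probe how loose
# the [,\d\s]+ class is.
for text in ["1 member", "1,000 members", "1 2 3 members", "1,,, members", "1members"]:
    print(repr(text), bool(PATTERN.fullmatch(text)))
```

Because the class mixes digits, commas and whitespace freely, this pattern also accepts inputs such as `1 2 3 members` and `1,,, members`; it only insists on at least one digit followed by at least one character from the class.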

You can also try this:

(\d|,)+ members?

At first, (\d|,)+ will match any decimal digit or comma, one or more times, then the regex will match a space, then member or members (? means the s can occur 0 or 1 times).
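A short Python sketch of this pattern, for illustration only. The character-class form `[\d,]+` is my own rewrite of the alternation, not something from the answer; it accepts the same strings without creating a capture group.

```python
import re

ALTERNATION = re.compile(r"(\d|,)+ members?")  # pattern from the answer
CHAR_CLASS = re.compile(r"[\d,]+ members?")    # assumed equivalent rewrite

for text in ["1 member", "999,999,999,999 members", ",,, members", "1 2 members"]:
    print(
        repr(text),
        bool(ALTERNATION.fullmatch(text)),
        bool(CHAR_CLASS.fullmatch(text)),
    )
```

Since digits and commas can appear in any order, a string like `,,, members` is accepted as well.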

This will match everything in the list:

\d+(,\d{3})* member(s)?

But it will also match: 1 members

Is that acceptable? If not, you could use:

1 member|\d+(,\d{3})* members
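The two patterns in this answer can be compared side by side with a small Python sketch. The names `FIRST` and `SECOND`, the extra test strings and the use of `fullmatch` are assumptions made for this illustration.

```python
import re

FIRST = re.compile(r"\d+(,\d{3})* member(s)?")
SECOND = re.compile(r"1 member|\d+(,\d{3})* members")

for text in ["1 member", "2 members", "1,000 members", "5 member", "1 members"]:
    first = bool(FIRST.fullmatch(text))
    second = bool(SECOND.fullmatch(text))
    print(f"{text!r}: first={first}, second={second}")
```

Run this way, the second pattern rejects `5 member`, but `1 members` still gets through its second alternative, because `\d+` accepts `1` as well.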

I'm not sure how pedantic you need your expression to be, but your accepted answer will give you some false positives with respect to your example. That is, the following lines, among others, will match; whether that is acceptable is up to you:

1 members       # Plural members for '1'
5 member        # Non-plural member
1000,0 members  # Invalid comma separator
1000000 members # Missing comma separator
00000 members   # Multiple zeros (or any other number)
010 member      # Leading zeros
1, 1 member     # Invalid
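The list can be reproduced with a few lines of Python. This page does not say which answer was accepted, so the sketch assumes it was `\d+(?:,\d+)* members?` and that the pattern is applied with `re.search` (an unanchored search); both are assumptions, and `ASSUMED_ACCEPTED` is a name invented here.

```python
import re

# Assumption: this is the accepted pattern the answer refers to.
ASSUMED_ACCEPTED = re.compile(r"\d+(?:,\d+)* members?")

FALSE_POSITIVES = [
    "1 members",
    "5 member",
    "1000,0 members",
    "1000000 members",
    "00000 members",
    "010 member",
    "1, 1 member",
]

for line in FALSE_POSITIVES:
    found = ASSUMED_ACCEPTED.search(line)
    if found:
        print(f"{line!r}: matched {found.group(0)!r}")
    else:
        print(f"{line!r}: no match")
```

Under those assumptions every line produces a match. For `1, 1 member` the match is only the trailing substring `1 member`, which is why it counts as a hit for an unanchored search.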

The following regular expression will match the exact pattern stated in your example:

^1 member|^[1-9]\d{0,2}(,\d{3})* members

^ ensures we match starting at the beginning of the line.

1 member is a special, non-plural case.

[1-9]\d{0,2} matches the numbers 1-999, but not expressions with a leading 0 (such as 0 or 010) ...

(,\d{3})* ... followed by any number of groups of ',000-999'.
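Here is a Python sketch that tries this expression on the question's samples and on the false positives listed earlier in this answer. `ANCHORED` is the expression as written. `STRICT` is a hypothetical tighter variant of my own, not from the answer: it adds a `$` anchor and a negative lookahead `(?!1 )` so that the plural branch cannot start with a lone `1`.

```python
import re

ANCHORED = re.compile(r"^1 member|^[1-9]\d{0,2}(,\d{3})* members")

# Hypothetical stricter variant (an assumption, not the answer's pattern).
STRICT = re.compile(r"^(?:1 member|(?!1 )[1-9]\d{0,2}(?:,\d{3})* members)$")

VALID = [
    "1 member", "2 members", "10 members", "100 members", "1,000 members",
    "10,000 members", "100,000 members", "100,000,000 members",
    "999,999,999,999 members",
]
INVALID = [
    "1 members", "5 member", "1000,0 members", "1000000 members",
    "00000 members", "010 member", "1, 1 member",
]

for line in VALID + INVALID:
    anchored = bool(ANCHORED.match(line))
    strict = bool(STRICT.match(line))
    print(f"{line!r}: anchored={anchored}, strict={strict}")
```

With `re.match`, the expression as written accepts all the valid samples and rejects every listed false positive except `1 members`, which still passes through the plural branch. The stricter variant rejects that one too, at the cost of a slightly harder-to-read pattern.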

