
How to find all strings of length 5 that have 1 digit and 4 letters, divided into all group combinations

I need a regex to count all strings of length 5 that contain 1 digit (0-9) and 4 lowercase letters (a-z), split into the following groups:

  • 1 digit and all letters are different
    For example: 1abcd
  • 1 digit, 2 letters are equal and the rest are different
    For example: a2acd
  • 1 digit, 3 letters are equal and the rest are different
    For example: aa3ad
  • 1 digit, 4 letters are equal
    For example: aa5aa
  • 1 digit, 2 letters are equal and two different other letters are equal
    For example: 1aabb

I know how to match all the strings with length of 5 with letters and 1 digit:
^(?=.{5}$)[a-z]*(?:\d[a-z]*){1}$
Here is an example.

But I don't know how to do it for each of the above groups.
I read that for the first group (1 digit and all letters different) I need to reject a repeated character with .*(.).*\1, but I tried:

^(?=.{5}$)[a-z]*(?:\d[a-z]*)(.*(.).*\1){1}$  

It didn't work.

You can use:

/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5})/

Demo

Add a second \b to reject matching strings longer than 5 characters:

/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5}\b)/

Demo 2

If you then want to limit to lower case letters:

/\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)/
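As a quick check of what the lowercase pattern accepts, here is a minimal sketch in Python. The sample text is made up for illustration and is not from the question. It also shows one behavior worth knowing: the lookahead is not anchored at the end of the word, so it only requires at least one digit, and a word with two digits still matches.

```python
import re

# Lowercase pattern from the answer above, with word boundaries on both sides.
WORD_PATTERN = re.compile(r'\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)')

# Hypothetical sample text (an assumption, not taken from the question).
sample = "1abcd ab1cd abcde abcdef1 12abc"

# findall returns the content of the single capture group for each match.
matches = WORD_PATTERN.findall(sample)

# abcde has no digit and abcdef1 is too long, so both are skipped;
# 12abc is accepted because the lookahead does not limit the digit count.
print(matches)
```

If exactly one digit is required, the lookahead has to be anchored at the end of the string or line as well.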

Since the five groups together cover every possible combination of the four letters (all the same, all different, or some the same), the regex itself does not need to classify the letters any further.

If you DO want to classify the letters, just capture in Python and add the logic desired.
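If a regex-only check is preferred for the first group in the question (1 digit, all letters different), the back-reference idea from the question can be moved into a negative lookahead. This is only a sketch under the assumption that the whole string is being tested; the names ALL_DIFFERENT and is_all_different are made up for this example.

```python
import re

# Assumption: the whole string is tested, not words inside a longer text.
ALL_DIFFERENT = re.compile(
    r'^(?=[a-z]*\d[a-z]*$)'  # exactly one digit, everything else a-z
    r'(?!.*(.).*\1)'         # reject the string if any character occurs twice
    r'[a-z0-9]{5}$'          # total length of 5
)

def is_all_different(s):
    """Return True for strings such as 1abcd: one digit, four distinct letters."""
    return ALL_DIFFERENT.match(s) is not None

print(is_all_different("1abcd"))  # one digit, no repeated letter
print(is_all_different("a2acd"))  # the letter a is repeated
```

The other groups are harder to express this way, which is one reason to do the classification in Python rather than in the regex.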


Based on your example (it would be helpful if the question stated what should and should not match):

/(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)/mg

Demo 3

Then if you want to classify into groups, I would just do that in Python:

import re 

st='''\
1aaaa
2aabb
jwzw3
jlwk6
bjkgp
5fm8s
x975t
k88q5
zl796
qm9hb
h6gtf
9rm9p'''

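# Map the number of distinct letters to the list of matching strings
di={}
for m in re.finditer(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', st, re.M):
    # len(set(...)) counts distinct characters; subtract 1 for the single digit
    di.setdefault(len(set(m.group(1)))-1, []).append(m.group(1))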

>>> di
{1: ['1aaaa'], 2: ['2aabb'], 3: ['jwzw3'], 4: ['jlwk6', 'qm9hb', 'h6gtf']}
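Note that counting distinct letters puts "three equal plus one different" (such as aa3ad) and "two pairs" (such as 1aabb) under the same key, 2. To separate all five groups from the question, one option is to look at how often each letter occurs. The following is a sketch only; the group labels and the names ONE_DIGIT, GROUPS and classify are made up for this example.

```python
import re
from collections import Counter

# Same anchored pattern as above: exactly one digit, four lowercase letters.
ONE_DIGIT = re.compile(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', re.M)

# Hypothetical labels, keyed by the sorted occurrence counts of the letters.
GROUPS = {
    (1, 1, 1, 1): 'all different',
    (1, 1, 2): 'one pair',
    (1, 3): 'three equal',
    (4,): 'four equal',
    (2, 2): 'two pairs',
}

def classify(text):
    """Group every matching line of text by its letter-repetition shape."""
    result = {}
    for m in ONE_DIGIT.finditer(text):
        word = m.group(1)
        letters = [c for c in word if c.isalpha()]
        # e.g. a2acd -> counts {a: 2, c: 1, d: 1} -> shape (1, 1, 2)
        shape = tuple(sorted(Counter(letters).values()))
        result.setdefault(GROUPS[shape], []).append(word)
    return result

# The five examples from the question plus two strings that should not match.
groups = classify("1abcd\na2acd\naa3ad\naa5aa\n1aabb\n12abc\nabcde")
print(groups)
```

The five keys of GROUPS are exactly the ways to split four letters into groups of equal letters, so every matching string falls into one of them.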


 