
Regex: Return a different ordering of matches in a single capturing group

I'm trying to extract user identities from a smartcard, and I need to match this pattern: CN=LAST.FIRST.MIDDLE.0000000000

And have this result returned: FIRST.LAST

This would normally be easy if I were doing this in my own code:

# Python example
import re

string = 'CN=LAST.FIRST.MIDDLE.000000000'
pattern = r'CN=(\w+)\.(\w+)\.'
match = re.search(pattern, string)

# Swap the two captured names to get FIRST.LAST
parsedResult = match.groups()[1] + '.' + match.groups()[0]

Unfortunately, I am matching a pattern using Keycloak's X.509 certmap web form. I am limited to a single regular expression, and that regular expression can contain only one capturing group. This is an HTML form, so there is no actual code involved, just a single regular expression.

It seems as if I need sub-capturing groups that return the second matched group first and then the first matched group, all within the main capturing group. Is something like this possible?

Also, I assume we are limited to whatever features are supported by Java because that is what the app runs on.

I don't think this is possible with just one capturing group. If I read the Keycloak documentation correctly, the capturing group is the result of the regular expression. So you can match FIRST, or LAST, or both in their original order, but you cannot reorder them.
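To illustrate the point, here is a minimal sketch in Python (not Keycloak itself) of what a single capturing group can return for this input. The helper name `first_group` and the three patterns are illustrative assumptions; the takeaway is that a group always yields one contiguous slice of the input, so the fields come back in their original order.

```python
import re

subject = "CN=LAST.FIRST.MIDDLE.0000000000"

def first_group(pattern, text):
    """Return the text of capturing group 1, or None if there is no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

# One group around both names: the original order is preserved.
both_in_order = first_group(r"CN=(\w+\.\w+)\.", subject)

# One group around only the second field.
first_name_only = first_group(r"CN=\w+\.(\w+)\.", subject)

# One group around only the first field.
last_name_only = first_group(r"CN=(\w+)\.", subject)

print(both_in_order)    # LAST.FIRST
print(first_name_only)  # FIRST
print(last_name_only)   # LAST
```

None of these produce FIRST.LAST, because that string never appears contiguously in the input.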

Yes, it is possible. This expression might help you to do so:

CN=([A-Z]+)\.(([A-Z]+)+)\.([A-Z]+)\.([0-9]+)

RegEx

If this isn't the expression you wanted, you can modify it on regex101.com. For example, you can loosen the boundaries and simplify it considerably. This would also work:

CN=(\w+)\.(\w+)(.*) 
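As a rough check of the simplified expression, the sketch below shows what its groups capture and how a substitution reorders them. Note the assumption: the reordering comes from the replacement string `\2.\1`, so this only helps where a replacement step is available, which the question says the web form does not offer.

```python
import re

pattern = r"CN=(\w+)\.(\w+)(.*)"
subject = "CN=LAST.FIRST.MIDDLE.000000000"

# The three groups: last name, first name, and the rest of the string.
groups = re.search(pattern, subject).groups()

# The swap happens in the replacement template, not in the pattern itself.
reordered = re.sub(pattern, r"\2.\1", subject)

print(groups)     # ('LAST', 'FIRST', '.MIDDLE.000000000')
print(reordered)  # FIRST.LAST
```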

RegEx Circuit

You can also visualize your expressions at jex.im.

Python Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"CN=([A-Z]+)\.(([A-Z]+)+)\.([A-Z]+)\.([0-9]+)"

test_str = "CN=LAST.FIRST.MIDDLE.000000000"

# Replacement template: group 2 (FIRST), a literal dot, then group 1 (LAST)
subst = "\\2.\\1"

# You can limit the number of replacements with the count argument (0 replaces all matches)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print (result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

JavaScript Demo

const regex = /CN=([A-Z]+)\.(([A-Z]+)+)\.([A-Z]+)\.([0-9]+)/gm;
const str = `CN=LAST.FIRST.MIDDLE.000000000`;
const subst = `$2.$1`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
