
Regex to split on capital letters and numbers that are acronyms

The following regex pattern splits strings on capital letters (e.g. GetDatabaseIDE becomes Get Database IDE):

Regex.Matches("GetDatabaseIDE", @"([A-Z]+)([^A-Z])*").Cast<Match>().Select(m => m.Value);

How can this pattern be changed so that it also handles numbers while keeping its current behaviour? For example, GetDatabase2FA should return Get Database 2FA.
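
To see why the original pattern needs changing, it helps to run it on the digit-containing inputs. The following is only a minimal sketch; the names `OriginalPatternDemo` and `SplitWithOriginal` are made up for illustration and are not part of the question.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // Hypothetical helper: applies the question's original pattern
    // and joins the matched values with single spaces.
    public static string SplitWithOriginal(string text)
    {
        var parts = Regex.Matches(text, @"([A-Z]+)([^A-Z])*")
                         .Cast<Match>()
                         .Select(m => m.Value);
        return string.Join(" ", parts);
    }
}
```

With this helper, GetDatabaseIDE gives Get Database IDE as intended, but GetDatabase2FA gives Get Database2 FA (the digit is swallowed by `[^A-Z]`), and 2FAGetDatabase gives FAGet Database (the leading 2 is never matched, and `[A-Z]+` greedily runs from the acronym into the next word).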

EDIT:

The desired regex pattern should split strings as follows:

2FAGetDatabase ---> 2FA Get Database

Get2FADatabase ---> Get 2FA Database

GetDatabase2FA ---> Get Database 2FA

MY SOLUTION:

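// Requires: using System.Linq; using System.Text.RegularExpressions;
// As an extension method, this must be declared inside a static class.
public static string ToSentence(this string text)
{
    string pattern;
    if (text.Any(char.IsDigit))
    {
        // Zero-width split points: lowercase->uppercase, end of an acronym, lowercase->digit.
        pattern = @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|(?<=[a-z])(?=[0-9])";
        return string.Join(" ", Regex.Split(text, pattern));
    }
    else
    {
        // No digits: match each run of capitals together with the non-capitals that follow it.
        pattern = @"([A-Z]+)([^A-Z])*";
        return string.Join(" ", Regex.Matches(text, pattern).Cast<Match>().Select(m => m.Value));
    }
}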
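
The lookaround pattern in the digit branch also seems to split digit-free PascalCase names such as GetDatabaseIDE correctly, so the branch may not be needed. Below is a minimal sketch under that assumption; the class name `SentenceExtensions` and the method name `ToSentenceSingle` are illustrative and not from the original post.

```csharp
using System.Text.RegularExpressions;

public static class SentenceExtensions
{
    // Same three zero-width alternatives as the digit branch above,
    // compiled once and reused for every call.
    private static readonly Regex WordBoundary = new Regex(
        @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|(?<=[a-z])(?=[0-9])");

    // Hypothetical single-pattern variant of ToSentence.
    public static string ToSentenceSingle(this string text)
    {
        return string.Join(" ", WordBoundary.Split(text));
    }
}
```

One behavioural difference to be aware of: for input that starts with a lowercase run, such as getDatabase, this variant returns get Database, whereas the `Regex.Matches` branch would drop the leading get and return only Database.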

How about this?

var pattern = @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|(?<=[a-z])(?=[0-9])";
// Regex.Split returns a string[]; the comments list its elements.
Regex.Split("2FAGetDatabase", pattern);
// { "2FA", "Get", "Database" }
Regex.Split("Get2FADatabase", pattern);
// { "Get", "2FA", "Database" }
Regex.Split("GetDatabase2FA", pattern);
// { "Get", "Database", "2FA" }
Regex.Split("GetIDEDatabase2FA", pattern);
// { "Get", "IDE", "Database", "2FA" }
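
For readers unfamiliar with lookarounds, the same pattern can be written with inline comments using `RegexOptions.IgnorePatternWhitespace`. This is only a sketch of how the three alternatives might be read; the names `AnnotatedSplitter` and `SplitWords` are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class AnnotatedSplitter
{
    // Each alternative is zero-width: it marks a position between two
    // characters without consuming either of them.
    private static readonly Regex Boundary = new Regex(@"
          (?<=[a-z])(?=[A-Z])        # lowercase then uppercase:       Get|Database
        | (?<=[A-Z])(?=[A-Z][a-z])   # acronym then capitalised word:  2FA|Get, IDE|Database
        | (?<=[a-z])(?=[0-9])        # lowercase then digit:           Get|2FA
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns the pieces between the split points.
    public static string[] SplitWords(string text)
    {
        return Boundary.Split(text);
    }
}
```

Because none of the alternatives covers an uppercase letter followed by a digit, an input like GetIDE2FA comes back as Get and IDE2FA. Whether that matters depends on the names being split.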


 