I'm trying to match a file name like xxxxSystemCheckedOut.png, where xxxx can be any prefix and System and CheckedOut are keywords to identify.
EDIT: I wasn't being clear on all the possible file names and their results. So filenames can be
This is my current regex. It matches the file name like I want it to, but I can't get it to group in the right way. Using the previous example I'd like the groups to be like this:
(?:([\\w]*)(CheckedOut|System)+(\\.[a-z]*)\\Z)
[EDIT] Give this a try.
Pattern: (.*?)(?:(System)|(CheckedOut)|(Cached))+(.png)\\Z
String: xxxxTESTSystemCached.png
Groups:
UPDATE - Based on comments to other answers: This should work for all combinations of System/CheckedOut/Cached:
(\w+?)(System)?(CheckedOut)?(Cached)?(.png)
https://regex101.com/r/qT2sX9/1
Note that the groups for missing keywords will still exist, so for example:
"abcdSystemCached.png" gives:
Match 1 : "abcd"
Match 2 : "System"
Match 3 :
Match 4 : "Cached"
Match 5 : ".png"
And "1234CheckedOutCached.png" gives:
Match 1 : "1234"
Match 2 :
Match 3 : "CheckedOut"
Match 4 : "Cached"
Match 5 : ".png"
This is kinda nice, as you know a particular keyword will always be in a certain position, so it becomes like a flag.
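The flag idea can be sketched in C# with Group.Success, which reports whether an optional group took part in the match. This is only a sketch: the KeywordFlags class, the Parse method, and the tuple field names are made up for illustration, and the pattern is anchored with ^ and $ and has its dot escaped, which the pattern above does not do.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeywordFlags
{
    // Same group layout as the pattern above; the anchors and the escaped dot
    // are assumptions added for this sketch.
    private static readonly Regex Rx =
        new Regex(@"^(\w+?)(System)?(CheckedOut)?(Cached)?(\.png)$");

    // Returns the prefix plus one flag per keyword, or null when the name does not match.
    public static (string Prefix, bool HasSystem, bool HasCheckedOut, bool HasCached)? Parse(string fileName)
    {
        Match m = Rx.Match(fileName);
        if (!m.Success) return null;

        // Group.Success is false for an optional group that did not participate.
        return (m.Groups[1].Value, m.Groups[2].Success, m.Groups[3].Success, m.Groups[4].Success);
    }

    public static void Main()
    {
        foreach (string name in new[] { "abcdSystemCached.png", "1234CheckedOutCached.png" })
        {
            var result = Parse(name);
            Console.WriteLine(result.HasValue ? $"{name}: {result.Value}" : $"{name}: no match");
        }
    }
}
```

Checking Success rather than comparing Value against an empty string keeps "did not participate" separate from "matched nothing".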
From the comments: "I actually need the groups separately so I know how to operate on the image; each keyword results in a different operation on the image."
You really don't need to use separate capture buffers on the keywords.
If you need the order of the matched keywords relative to one another,
you'd use the below code. Even if you didn't need the order it could be
done like that.
( .*? ) # (1)
( System | CheckedOut )+ # (2)
\.png $
C#:
using System;
using System.Text.RegularExpressions;

string fname = "xxxxSystemCheckedOutSystemSystemCheckedOutCheckedOut.png";
Regex RxFname = new Regex( @"(.*?)(System|CheckedOut)+\.png$" );
Match fnameMatch = RxFname.Match( fname );
if ( fnameMatch.Success )
{
    Console.WriteLine("Group 0 = {0}", fnameMatch.Groups[0].Value);
    Console.WriteLine("Group 1 = {0}", fnameMatch.Groups[1].Value);
    // Groups[2].Value only holds the last keyword that group 2 captured.
    Console.WriteLine("Last Group 2 = {0}\n", fnameMatch.Groups[2].Value);
    // Captures holds every keyword group 2 captured, in order.
    CaptureCollection cc = fnameMatch.Groups[2].Captures;
    Console.WriteLine("Array and order of group 2 matches (collection):\n");
    for (int i = 0; i < cc.Count; i++)
    {
        Console.WriteLine("[{0}] = '{1}'", i, cc[i].Value);
    }
}
Output:
Group 0 = xxxxSystemCheckedOutSystemSystemCheckedOutCheckedOut.png
Group 1 = xxxx
Last Group 2 = CheckedOut
Array and order of group 2 matches (collection):
[0] = 'System'
[1] = 'CheckedOut'
[2] = 'System'
[3] = 'System'
[4] = 'CheckedOut'
[5] = 'CheckedOut'
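Since each keyword is supposed to trigger a different operation on the image, the capture collection could be walked and dispatched on. The following is a rough sketch of that idea, not a definitive implementation: the KeywordDispatch class, the KeywordsInOrder helper, and the printed operation names are hypothetical, and the ^ anchor is an addition to the pattern above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordDispatch
{
    // Single repeated group; the leading ^ is an assumption added for this sketch.
    private static readonly Regex RxFname = new Regex(@"^(.*?)(System|CheckedOut)+\.png$");

    // Returns the keywords in the order they appear, or an empty list when there is no match.
    public static List<string> KeywordsInOrder(string fname)
    {
        Match m = RxFname.Match(fname);
        if (!m.Success) return new List<string>();
        return m.Groups[2].Captures.Cast<Capture>().Select(c => c.Value).ToList();
    }

    public static void Main()
    {
        foreach (string keyword in KeywordsInOrder("xxxxSystemCheckedOut.png"))
        {
            // Placeholder actions; the real image operations are not given in the thread.
            switch (keyword)
            {
                case "System":
                    Console.WriteLine("apply the System operation (hypothetical)");
                    break;
                case "CheckedOut":
                    Console.WriteLine("apply the CheckedOut operation (hypothetical)");
                    break;
            }
        }
    }
}
```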
I'm no Regex wizard, so if this can be shortened/tidied I'd love to know, but this groups like you want based on the keywords you gave:
Edited based on the OP's clarification of the file structure:
/(\w+?)(system)?(checkedout)?(cached)?(.png)/ig
Edit: beercohol and jon have me beat ;-)
I read somewhere (can't remember where) that the more precise your pattern is, the better performance you'll get from it.
So try this pattern
"(\\w+?)(?:(System)|(CheckedOut))+(.png)"
Code Sample:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

List<string> fileNames = new List<string>
{
    "xxxxSystemCheckedOut.png", // Good
    "SystemCheckedOut.png", // Good
    "1afweiljSystemCheckedOutdgf.png", // Bad - Garbage characters before .png
    "asdf.png", // Bad - No System or CheckedOut
    "xxxxxxxSystemCheckedOut.bmp", // Bad - Wrong file extension
    "xxSystem.png", // Good
    "xCheckedOut.png" // Good
};
foreach (Match match in fileNames.Select(fileName => Regex.Match(fileName, "(\\w+?)(?:(System)|(CheckedOut))+(.png)")))
{
    // Keep only the groups that captured something; a failed match leaves none.
    List<Group> matchedGroups = match.Groups.Cast<Group>().Where(group => !String.IsNullOrEmpty(group.Value)).ToList();
    if (matchedGroups.Count > 0)
    {
        matchedGroups.ForEach(Console.WriteLine);
        Console.WriteLine();
    }
}
Results:
xxxxSystemCheckedOut.png
xxxx
System
CheckedOut
.png

SystemCheckedOut.png
System
CheckedOut
.png

xxSystem.png
xx
System
.png

xCheckedOut.png
x
CheckedOut
.png
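One detail in the results for SystemCheckedOut.png: because \w+? needs at least one character, the "System" printed there is group 1 (the prefix), not the System keyword group. If an empty prefix should be allowed, a variation along these lines might help. It is only a sketch under assumptions: the EmptyPrefix class and Describe method are invented for illustration, and the \w*? quantifier, the ^ and $ anchors, and the escaped dot are all changes to the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyPrefix
{
    // \w*? lets the prefix be empty; ^ and $ anchor the whole name; the dot is escaped.
    private static readonly Regex Rx =
        new Regex(@"^(\w*?)(?:(System)|(CheckedOut))+(\.png)$");

    // Summarises the prefix and which keyword groups took part in the match.
    public static string Describe(string fileName)
    {
        Match m = Rx.Match(fileName);
        if (!m.Success) return "no match";
        return $"prefix='{m.Groups[1].Value}' system={m.Groups[2].Success} checkedOut={m.Groups[3].Success}";
    }

    public static void Main()
    {
        foreach (string name in new[] { "SystemCheckedOut.png", "xxSystem.png", "asdf.png" })
        {
            Console.WriteLine($"{name}: {Describe(name)}");
        }
    }
}
```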