I have a very old (and strangely delimited) string that represents a table and I want to get all text in between two "tags" (they are an abomination... here they are in all of their glory):
<<<NAME=Test User>>>
<<<DATE=11/06/2014>>>
|||COMMENTS_FOLLOW_UP=\\myserver\Reporter\testu\20140611.rtf|||
|||COMMENTS_APPOINTMENT_LIST=\\myserver\Reporter\testu\COMMENTS_APPOINTMENT_LIST_20140611.rtf|||
~~~ START MONTHLY BREAKDOWN ~~~
### ROW START ###
<<<ACTIVITY=Target Group Support>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=25%>>>
### ROW END ###
### ROW START ###
<<<ACTIVITY=Non-target Group Support>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=25%>>>
### ROW END ###
### ROW START ###
<<<ACTIVITY=Networking/Guest Speaking Activities>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=25%>>>
### ROW END ###
### ROW START ###
<<<ACTIVITY=Processing initial calls, making appointments, completing reports and other tasks>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=25%>>>
### ROW END ###
### ROW START ###
<<<ACTIVITY=Total>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=100%>>>
### ROW END ###
~~~ END MONTHLY BREAKDOWN ~~~
~~~ START EVENTS ~~~
### ROW START ###
<<<DATE=11/06/2014 12:00:00 AM>>>
<<<EVENT_NAME=Test's Event>>>
<<<NAME_OF_ORGANISATION/GROUP=Tests Org>>>
<<<PARTICIPANT_GROUP=Test>>>
<<<NUMBER_OF_PARTICIPANTS=50>>>
### ROW END ###
~~~ END EVENTS ~~~
So I need to get the text between the delimiters ~~~ START XXX ~~~ and ~~~ END XXX ~~~.
So here's the pattern I whipped up: ~~~ START .+~~~(.*)~~~ END .+~~~
As you can see, a master of the Regex-Fu, I am not.
NOTE: I am using the SingleLine flag.
The Problem: This matches the correct text but only returns one group, that of the body text of the first table tag. How do I get the C# regex-a-tron 9000 to also return the body text from the second tag in a second match group?
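You can use Regex.Matches:
var matches = Regex.Matches(input_string, regex);
// Declare the loop variable as Match: MatchCollection's non-generic
// enumerator would otherwise make "var m" an object on older frameworks.
foreach (Match m in matches)
{
    // do whatever
}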
Or, you can get a match, then get the next match, etc:
var m = Regex.Match(input_string, regex);
while (m.Success)
{
    // do something with this match
    // then get the next match
    m = m.NextMatch();
}
First off, I recommend you change your regex to this:
(?s)~~~ START ([^~]*).*?END \1 ~~~
After START, the ([^~]*) captures the title of the block. This ensures that we can make sure the END matches later.
.*? matches up to... END, the captured title (backreference \1) and closing tildes.
Sample Code
Here is a full program you can test it with. I haven't tried it. You'll need to paste the string in there.
using System;
using System.Text.RegularExpressions;
using System.Collections.Specialized;
class Program {
    static void Main() {
        string s1 = @"PASTE YOUR STRING HERE";
        var myRegex = new Regex(@"(?s)~~~ START ([^~]*).*?END \1 ~~~");
        MatchCollection AllMatches = myRegex.Matches(s1);
        Console.WriteLine("\n" + "*** Matches ***");
        if (AllMatches.Count > 0) {
            foreach (Match SomeMatch in AllMatches) {
                Console.WriteLine("Title: " + SomeMatch.Groups[1].Value);
                Console.WriteLine("Overall Match: " + SomeMatch.Value);
            }
        }
        Console.WriteLine("\nPress Any Key to Exit.");
        Console.ReadKey();
    } // END Main
} // END Program
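The program above prints the title and the whole match, while the question asks for the body text of each block. As a minimal sketch of one way to get that, the variant below adds a second capture group for the body and uses named groups. The names BlockParser, ExtractBlocks, title and body are made up for this illustration, and the sample string is a shortened copy of the data from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BlockParser
{
    // (?s) lets '.' match newlines. The lazy 'title' group stops at the first
    // " ~~~", and \k<title> forces the END tag to carry the same title.
    private static readonly Regex BlockRegex = new Regex(
        @"(?s)~~~ START (?<title>[^~]+?) ~~~(?<body>.*?)~~~ END \k<title> ~~~");

    // Returns one (title, body) pair per START/END block, in document order.
    public static List<KeyValuePair<string, string>> ExtractBlocks(string input)
    {
        var blocks = new List<KeyValuePair<string, string>>();
        foreach (Match m in BlockRegex.Matches(input))
        {
            blocks.Add(new KeyValuePair<string, string>(
                m.Groups["title"].Value,
                m.Groups["body"].Value.Trim())); // drop the surrounding line breaks
        }
        return blocks;
    }
}

public static class BlockParserDemo
{
    public static void Main()
    {
        // Shortened sample taken from the question's data.
        string sample = @"~~~ START MONTHLY BREAKDOWN ~~~
### ROW START ###
<<<ACTIVITY=Total>>>
<<<PERCENTAGE_OF_TIME_TAKEN_FOR_THE_MONTH=100%>>>
### ROW END ###
~~~ END MONTHLY BREAKDOWN ~~~
~~~ START EVENTS ~~~
### ROW START ###
<<<EVENT_NAME=Test's Event>>>
### ROW END ###
~~~ END EVENTS ~~~";

        foreach (var block in BlockParser.ExtractBlocks(sample))
        {
            Console.WriteLine("Title: " + block.Key);
            Console.WriteLine("Body:");
            Console.WriteLine(block.Value);
        }
    }
}
```

Because the body group is lazy and the END tag must repeat the captured title, each block becomes its own match, so the second table shows up as a second Match rather than as an extra group on the first one.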
You need to call the regex matcher multiple times in a loop, until there is no match. Consider modifying the expression to avoid backtracking - in your case, this is very possible, because .+ is greedy (as opposed to "reluctant").
Here is a small demo of how you can do it:
var regex = new Regex("~~~ START ([^~]+)~~~([^~]*)~~~ END ([^~]+)~~~", RegexOptions.Multiline);
var m = regex.Match(Data);
while (m.Success) {
    Console.WriteLine("------ Start: {0} --------", m.Groups[1]);
    Console.WriteLine(m.Groups[2]);
    Console.WriteLine("------ End: {0} --------", m.Groups[3]);
    m = m.NextMatch();
}
This example running on ideone.
Note the changes above - I replaced . with [^~] to match up to the first squiggly, and I also captured the content of the start and end tags for printing.
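To make the greedy versus reluctant point concrete, here is a small sketch that counts matches for the question's original pattern and for a version with lazy quantifiers, both run with RegexOptions.Singleline. The class name GreedyVsLazy, the helper CountMatches and the two-block input string are invented for this illustration. The lazy pattern is just an alternative to the [^~] approach above, not a replacement for it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Counts how many separate matches a pattern produces in Singleline mode.
    public static int CountMatches(string pattern, string input)
    {
        return Regex.Matches(input, pattern, RegexOptions.Singleline).Count;
    }

    public static void Main()
    {
        // Two tiny blocks standing in for the real tables.
        string input =
            "~~~ START A ~~~\nbody1\n~~~ END A ~~~\n" +
            "~~~ START B ~~~\nbody2\n~~~ END B ~~~";

        // Original pattern: the greedy .+ runs past the first closing tag.
        string greedy = @"~~~ START .+~~~(.*)~~~ END .+~~~";

        // Same shape, but every quantifier stops as early as it can.
        string lazy = @"~~~ START .+?~~~(.*?)~~~ END .+?~~~";

        Console.WriteLine("Greedy matches: " + CountMatches(greedy, input));
        Console.WriteLine("Lazy matches:   " + CountMatches(lazy, input));
    }
}
```

On this input the greedy pattern should report a single match that swallows both blocks, while the lazy one reports two, one per block. That is why looping over matches only helps once the quantifiers are tamed.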