
Regex: extract each block delimited by a matching start and end marker

Suppose we have a string like this:

----------DBVer=1
/*some sql script*/
----------DBVer=1
----------DBVer=2
/*some sql script*/
----------DBVer=2
----------DBVer=n
/*some sql script*/
----------DBVer=n

Can we use a regex to extract the script between the first DBVer=1 and the second DBVer=1, and likewise for every other version?

I think we need some kind of placeholder in the regex that tells the engine: if it sees DBVer=A, capture the text until DBVer=A appears again; if it sees DBVer=B, capture the text until DBVer=B appears again; and so on.

Can this be done with a regex and, if so, how?

Yes, using backreferences and lookarounds, you can capture the scripts:

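// Requires: using System.Linq; using System.Text.RegularExpressions;
// (?!\d) keeps the end marker for DBVer=1 from also matching DBVer=10.
var pattern = @"(?<=(?<m>-{10}DBVer=\d+)\r?\n).*(?=\r?\n\k<m>(?!\d))";
var scripts = Regex.Matches(input, pattern, RegexOptions.Singleline)
                .Cast<Match>()
                .Select(m => m.Value);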

Here, we capture the marker in the named group m with (?<m>-{10}DBVer=\d+) and reuse its value later in the regex with \k<m> to match the end marker.

For .* to match newline characters, Singleline mode must be turned on. That in turn means the newlines around the markers have to be matched explicitly; \r?\n does this in a platform-independent way, covering both Windows (\r\n) and Unix (\n) line endings.
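If the version number is needed along with each script, the same backreference idea can be sketched without lookarounds. The following is a minimal variant, not the pattern from the answer above: the class name DbVerExtractor, the method Extract, and the group names ver and body are hypothetical. It assumes every marker sits on its own line and that each block contains at least one line of script.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class DbVerExtractor
{
    // Consumes both marker lines instead of looking around them.
    // Multiline anchors ^ and $ to line boundaries; Singleline lets . span newlines.
    // \k<ver> requires the closing marker to repeat the opening version, and
    // \r?$ stops DBVer=1 from matching the start of DBVer=10.
    private static readonly Regex BlockPattern = new Regex(
        @"^-{10}DBVer=(?<ver>\d+)\r?\n(?<body>.*?)\r?\n-{10}DBVer=\k<ver>\r?$",
        RegexOptions.Singleline | RegexOptions.Multiline);

    // Returns (version, script) pairs in the order they appear in the input.
    public static List<KeyValuePair<int, string>> Extract(string input)
    {
        var result = new List<KeyValuePair<int, string>>();
        foreach (Match m in BlockPattern.Matches(input))
        {
            int version = int.Parse(m.Groups["ver"].Value);
            result.Add(new KeyValuePair<int, string>(version, m.Groups["body"].Value));
        }
        return result;
    }
}
```

Calling DbVerExtractor.Extract(input) on the sample from the question should yield one pair per version, each holding the text between that version's two marker lines. The lazy .*? stops at the first closing marker that repeats the version, so a block ends where its own version reappears rather than at a later one.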

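Try the code below. It is a line-by-line state machine rather than a single regex (a regex is used only to read the version number), but it works well.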

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;


namespace ConsoleApplication6
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.txt";
        static void Main(string[] args)
        {

            Script.ReadScripts(FILENAME);   
        }
    }
    public class Script
    {
        enum State
        {
            Get_Script,
            Read_Script
        }

        public static List<Script> scripts = new List<Script>();
        public int version { get; set; }
        public string script { get; set; }

        public static void ReadScripts(string filename)
        {
            string inputLine = "";
            string pattern = "DBVer=(?'version'\\d+)";
            State state = State.Get_Script;
            Script newScript = null;
            // The using block disposes the reader even if parsing throws.
            using (StreamReader reader = new StreamReader(filename))
            {
                while ((inputLine = reader.ReadLine()) != null)
                {
                    // Note: Trim() also strips any indentation inside the script.
                    inputLine = inputLine.Trim();
                    if (inputLine.Length > 0)
                    {
                        switch (state)
                        {
                            case State.Get_Script:
                                // Waiting for an opening marker line.
                                if (inputLine.StartsWith("-----"))
                                {
                                    newScript = new Script();
                                    scripts.Add(newScript);
                                    string version =
                                      Regex.Match(inputLine, pattern).Groups["version"].Value;
                                    newScript.version = int.Parse(version);
                                    newScript.script = "";
                                    state = State.Read_Script;
                                }
                                break;
                            case State.Read_Script:
                                // The next marker line closes the current block.
                                if (inputLine.StartsWith("-----"))
                                {
                                    state = State.Get_Script;
                                }
                                else
                                {
                                    // Append the line, separating lines with "\n".
                                    if (newScript.script.Length == 0)
                                    {
                                        newScript.script = inputLine;
                                    }
                                    else
                                    {
                                        newScript.script += "\n" + inputLine;
                                    }
                                }
                                break;
                        }
                    }
                }
            }
        }
    }
}
