
Multiline search and replace using the C# Regex class

I have some stored procedures that contain statements like this:

    SELECT columnA, columnB, COUNT(*) AS "COUNT" INTO temporaryTable
    FROM tableA
    WHERE columnA = "A"
      AND ISNULL(columnB, "B") = "B"
    GROUP BY columnA, columnB
    HAVING columnA = "A"
      AND ISNULL(columnB, "B") = "B"
    SELECT * FROM temporaryTable -- An empty line between two statements is not required.

As mentioned, these are procedures, so a single script contains many statements.

I load each of these procedures into a StringBuilder (which then contains a script like the one shown above).

I want to remove the HAVING part if (and only if!) it is exactly the same as the WHERE part, as in the example above.

So I immediately thought of regular expressions.

I have something like this:

    using System;
    using System.Text;
    using System.Text.RegularExpressions;

    class Program
    {
        // Called for every match: drops the HAVING clause when it equals the WHERE clause.
        static string RemoveHaving(Match m)
        {
            if (m.Groups[3].Value == m.Groups[7].Value)
            { /* WHERE == HAVING */
                Console.WriteLine("Same");
                // Group 1 is everything before HAVING; group 9 is the keyword that starts the next statement.
                return string.Concat(m.Groups[1].Value, m.Groups[9].Value);
            }

            Console.WriteLine("Not Same");
            return m.Groups[0].Value;
        }

        static void Main(string[] args)
        {
            // For the example (inside a verbatim string, a double quote is written as ""):
            StringBuilder procedure = new StringBuilder();
            procedure.Append(@"
                SELECT columnA, columnB, COUNT(*) AS ""COUNT"" INTO temporaryTable
                FROM tableA
                WHERE columnA = ""A""
                  AND ISNULL(columnB, ""B"") = ""B""
                GROUP BY columnA, columnB
                HAVING columnA = ""A""
                  AND ISNULL(columnB, ""B"") = ""B""
                SELECT * FROM temporaryTable -- An empty line between two statements is not required.");

            // Singleline lets '.' match newlines; without it the two-line WHERE/HAVING conditions cannot match.
            Regex reg = new Regex(@"((.*)where(.*)([\s^]+)group\s*by(.*)([\s^]+))having(.*)([\s^]+(SELECT|INSERT|UPDATE|DELETE))",
                RegexOptions.Compiled |
                RegexOptions.IgnoreCase |
                RegexOptions.Singleline |
                RegexOptions.Multiline);

            // Regex.Replace expects a string, not a StringBuilder.
            string newProcedure = reg.Replace(procedure.ToString(), (MatchEvaluator)RemoveHaving);
            Console.WriteLine("---");
            Console.WriteLine(newProcedure);
            Console.WriteLine("---");
        }
    }

It works, but it does not seem to be the best way...

How can I safely detect the end of the HAVING clause?

How would you approach this?

My first thought is this:

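    // Group 1 is the WHERE condition, group 2 the GROUP BY clause, group 3 whatever follows the HAVING clause.
    // \1 requires the HAVING condition to repeat the WHERE condition character for character.
    string pattern = @"WHERE\s+([\s\S]*?)(\s+GROUP\s+BY\s+[\s\S]*?)\s+HAVING\s+\1(\s+SELECT|\s*$)";
    string output = Regex.Replace(input, pattern, @"WHERE $1$2$3");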

However, this only works if the statement is immediately followed by the SELECT keyword or by the end of the input. Any difference in whitespace between the two conditions will also throw it off, as will reordering of the subclauses. If you want something that does this robustly, it is going to be very complicated without some kind of specialized SQL parser/optimizer.
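
The whitespace problem, at least, can be reduced by comparing normalized copies of the two conditions in a MatchEvaluator instead of relying on a backreference. The following is only a sketch under a few assumptions: the class name `HavingCleaner`, the method `RemoveRedundantHaving` and the list of statement keywords are made up for this illustration, a HAVING clause is assumed to end at the next SELECT, INSERT, UPDATE or DELETE keyword (or at the end of the script), and string literals are assumed not to contain significant runs of whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

static class HavingCleaner
{
    // Keywords assumed to start the next statement (hypothetical list, extend as needed).
    private const string Stop = @"\b(?:SELECT|INSERT|UPDATE|DELETE)\b";

    // Any single character that is not the start of one of the keywords above.
    private const string Any = "(?:(?!" + Stop + ").)";

    // head   = WHERE ... GROUP BY ... (everything kept when HAVING is dropped)
    // where  = the WHERE condition
    // having = the HAVING condition, up to the next statement keyword or end of input
    private static readonly Regex Candidate = new Regex(
        @"(?<head>\bWHERE\b(?<where>" + Any + @"*?)\bGROUP\s+BY\b" + Any + @"*?)" +
        @"\bHAVING\b(?<having>" + Any + "*)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Collapse every run of whitespace to a single space so layout does not matter.
    private static string Normalize(string sql)
    {
        return Regex.Replace(sql, @"\s+", " ").Trim();
    }

    public static string RemoveRedundantHaving(string script)
    {
        return Candidate.Replace(script, m =>
        {
            bool same = Normalize(m.Groups["where"].Value)
                     == Normalize(m.Groups["having"].Value);

            // Keep WHERE and GROUP BY when the conditions agree; otherwise leave the match untouched.
            return same ? m.Groups["head"].Value : m.Value;
        });
    }
}
```

With the StringBuilder from the question, this would be called as `HavingCleaner.RemoveRedundantHaving(procedure.ToString())`. The comparison is deliberately case-sensitive so that conditions differing only in the case of a literal are left alone. Reordered subclauses, subqueries inside the conditions and keywords that appear in comments are still not handled, which is where a real SQL parser becomes necessary.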
