简体   繁体   中英

Finding Functions within SQL

In something like the following code I'm wanting to pull out just the function that has more than one parameter, and want to ignore coalesce that has 2 parameters. Please help I've been working on this for a couple of days using REGEX and think that it's possible just not something that I have been able to wrap my head around. I believe that the answer is a group and the parsing that group with a sub regex or something to that effect.

AND NOT COALESCE(UPPER(FUNCTION_TO_FIND(B.PARAM, B.TEST_PARAM1, B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())),'NO_VALUE') IN (UPPER('Routine Appointment Letter'))
AND NOT UPPER(DBO.FUNCTION_TO_FIND( B.PARAM , B.TEST_PARAM1 , B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())) IN (UPPER('Routine Appointment Letter'))
AND NOT COALESCE(1, 3) = 2

I would expect to find

DBO.FUNCTION_TO_FIND( B.PARAM , B.TEST_PARAM1 , B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())

and

FUNCTION_TO_FIND(B.PARAM, B.TEST_PARAM1, B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())

Please note that the functions will not always be the same number of layers down, but they will all have more than 2 parameters.

I have tried different versions and edits of the following balanced parenthesis function but haven't been able to have it count the parameters properly to capture the entire function.

(\\((?>[^()]|(?1))*\\))

EDITS, CLARIFICATION BELOW

EDIT1 : Please note that when looking for functions I will not have access to the server that the SQL will be run against, this has to be done completely offline.

EDIT2 : Thinking about this further I think that this is a problem to solve in a couple parts, instead of one regex, using another tool to create regex on the fly.

  1. Create a list of functions that I don't want to find that are common TO_CHAR, TO_NUMBER, UPPER, LOWER, COALESCE, MIN, MAX, AND, EXISTS, COALESCE, SUM, FROM
  2. Find the start of any function that is used within the query string, using something like the following. [[:alnum:]][^)( \\r\\n]+?\\s*?\\(
  3. Back up one character in the string and use the following code to find the matched parenthesis. (\\((?>[^()]|(?1))*\\))
  4. ...

I will try something to the effect of above and come back with the answer. In the mean time if someone has a different idea please feel free to contribute.

Ok, I gave it a try, and Microsoft.SqlServer.Management.SqlParser.Parser from the Microsoft.SqlServer.SqlManagementObjects nuget package might be the way to go.

var sql = @"SELECT .... ";
var result = Parser.Parse(sql);
var batch = result.Script.Batches.First();
var select = batch.Children.Cast<SqlSelectStatement>().First();
var selectSpec = select.SelectSpecification;

As a result of this, selectSpec.Xml contains a hierarchical XML document representing the SELECT statement.

The SqlCodeObject s provide an Accept() method that implement the Visitor pattern. Write a visitor and evaluate expressions of the types you are interested in: SqlNullScalarExpression , SqlUserDefinedScalarFunctionCallExpression , maybe more.

Be aware that the parser does not recognize user-defined functions without the dbo. schema:

            <SqlNullScalarExpression Location="((3,9),(3,137))">
              <!--COALESCE(UPPER(FUNCTION_TO_FIND(B.PARAM, B.TEST_PARAM1, B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())),'NO_VALUE')-->
            </SqlNullScalarExpression>

versus

            <SqlUserDefinedScalarFunctionCallExpression Location="((4,15),(4,122))" ObjectIdentifier="DBO.FUNCTION_TO_FIND">
              <!--DBO.FUNCTION_TO_FIND( B.PARAM , B.TEST_PARAM1 , B.TEST_PARAM2, B.TEST_PARAM3,'Routine Type', ATT_TO_DATE())-->
              <SqlObjectIdentifier Location="((4,15),(4,35))" SchemaName="DBO" ObjectName="FUNCTION_TO_FIND">
                <!--DBO.FUNCTION_TO_FIND-->
                <SqlIdentifier Location="((4,15),(4,18))" Value="DBO">
                  <!--DBO-->
                </SqlIdentifier>
                <SqlIdentifier Location="((4,19),(4,35))" Value="FUNCTION_TO_FIND">
                  <!--FUNCTION_TO_FIND-->
                </SqlIdentifier>
              </SqlObjectIdentifier>
              ...
              <SqlBuiltinScalarFunctionCallExpression Location="((4,108),(4,121))" FunctionName="ATT_TO_DATE" IsStar="False">
                <!--ATT_TO_DATE()-->
              </SqlBuiltinScalarFunctionCallExpression>
            </SqlUserDefinedScalarFunctionCallExpression>

I used the following in c# mixed with regex to accomplish this task, yes it's very rough and could use some refinement but it does work. Feel free to improved the answer however I will not be able to provide ongoing support for the code below.

            Console.WriteLine("=====================================================");
        int numberOfIterations = 0;
        do
        {// Variables
            string searchResult; // Start of function call
            string functionName; // Function Name
                                 // funcFonud will be char length of function ( "IN (" length is 4)
                                 // used to trim the substring for the next itteration.
            int funcFoundNameLen = 0;
            int funcFoundAt = 0;
            int matchedParenEnd = 0; // End of Matched Paren

            // Find the start of a function call.
            string findFuncReg = "[a-zA-Z][^)( \\r\\n\\\"]+?\\s*?\\("; // fixes [[:alnum:]] issue
            if (query.Length > 0)
            {
                Match match = Regex.Match(query, findFuncReg);
                Console.WriteLine("Trying to match: "+query);
                if (match.Success && match.Index >= 0) // Function call found
                {
                    try
                    {
                        Console.WriteLine("Match was found: " + match.Index.ToString() + match.Value);
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex);
                    }
                    funcFoundNameLen = match.Length;
                    funcFoundAt = match.Index;
                    searchResult = query.Substring(funcFoundAt);

                    functionName = searchResult.Substring(0, searchResult.IndexOf('(')).Trim();
                    writeFunc(functionName);
                }
                else
                {
                    Console.WriteLine("Function Start NOT found");
                    return;
                }

                string subString = query.Substring(match.Index);
                int openParen = 0;

                // Following finds the matched paren
                for (int i = 0; i < subString.Length; i++)
                {
                    if (subString[i] == ')' && openParen == 1)
                    {
                        matchedParenEnd = i + 1; // Location of the end Paren
                        break;
                    }
                    // Following handles opening and closing of paren so that we
                    // can find the matched paren.
                    else
                    {
                        if (subString[i] == '(')
                        {
                            openParen = openParen + 1;
                        }
                        if (subString[i] == ')')
                        {
                            openParen = openParen - 1;
                        }
                    }
                }

                // Output function call.
                string subCall = subString.Substring(funcFoundNameLen, matchedParenEnd- funcFoundNameLen);
                Console.WriteLine("Sub Call: " + subCall);

                // Set up for recursive call.
                // string querySub = query.Substring(funcFoundAt);
                // Console.WriteLine("querySub: " + querySub);
                matchedParen(subCall);

                // Substring based on the location that the function call was
                // made to the end of the function call
                Console.WriteLine("Remove: " + (funcFoundAt + matchedParenEnd).ToString() + " letters");
                query = query.Substring(funcFoundAt + matchedParenEnd);
                numberOfIterations++;
            }
        } while (query.Length > 0 
        // && numberOfIterations < 1
        );

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM