简体   繁体   中英

How can I parse T-SQL to AST using the classes in Microsoft.SqlServer.Management.SqlParser

The class Parser only has the method Parse that returns a ParseResult , and seems I can do nothing with the ParseResult . How can I get the abstract syntax tree of my sql statement, or just the metadata tokens can be parsed from sql is fine.

I did some research, and found that I can use reflection to generate the parsed information xml file using SqlScript.WriteXml . And here's is the sample code, I don't know if there is any better way.

var rst = Parser.Parse(File.ReadAllText(@"*.sql"));
var fieldInfo = rst.GetType().GetField("sqlScript", BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.GetField);
var script = fieldInfo.GetValue(rst);
var writer = XmlWriter.Create("*.xml");
script.GetType().InvokeMember("WriteXml", BindingFlags.NonPublic| BindingFlags.Instance | BindingFlags.InvokeMethod
    , null, script, new object[] { writer });
writer.Close();

I was excited you found the AST at all! Working with it directly requires use of dynamic variables to access the .Child collection of the objects in the internal Microsoft.SqlServer.Management.SqlParser.SqlCodeDom namespace.

Rather than calling WriteXml, I recommend accessing existing the Xml string property. This saves having to deal with issues caused by nesting SQL comments within XML comments (can't have -- inside an XML comment; -- becomes - - ).

var rst = Parser.Parse(File.ReadAllText(@"*.sql"));
var script = rst.GetType().GetProperty("Script", BindingFlags.NonPublic | BindingFlags.Instance).GetValue(rst);
var xml = script.GetType().BaseType.GetProperty("Xml").GetValue(script) as String;

If you are genuinely willing to settle for just the metadata tokens, I found a PowerShell example here ; it works out to something like this (boilerplate metadata factory included for completeness):

// using Microsoft.SqlServer.Management.SqlParser.Parser;
// ...

var sql = File.ReadAllText(@"*.sql");
var scanner = new Scanner(new ParseOptions());
int scannerState = 0;
scanner.SetSource(sql, 0);
var allTokens = new List<MSSQL_Token_JS>();
MSSQL_Token_JS curToken = null;
do
{
    curToken = MSSQL_Token_JS.GetNext(scanner, sql, ref scannerState);
    allTokens.Add(curToken);
}
while (curToken.Value != Tokens.EOF);

//...

public class MSSQL_Token_JS
{
    public readonly string SourceSQL;
    public readonly Tokens Value;
    public readonly string Text;
    public readonly int ScannerState;
    public readonly int Start;
    public readonly int End;
    public readonly bool IsPairMatch;
    public readonly bool IsExecAutoParamHelp;

    private MSSQL_Token_JS(string SourceSQL, int tokenId, int ScannerState, int Start, int End, bool IsPairMatch, bool IsExecAutoParamHelp)
    {
        this.SourceSQL = SourceSQL;
        this.Value = (Tokens)tokenId;
        if (this.Value != Tokens.EOF)
        {
            this.Text = SourceSQL.Substring(Start, End - Start + 1);
        }

        this.ScannerState = ScannerState;
        this.Start = Start;
        this.End = End;
        this.IsPairMatch = IsPairMatch;
        this.IsExecAutoParamHelp = IsExecAutoParamHelp;
    }

    public static MSSQL_Token_JS GetNext(Scanner scanner, string SourceSQL, ref int ScannerState)
    {
        int start, end;
        bool isPairMatch, isExecAutoParamHelp;
        int tokenId = scanner.GetNext(ref ScannerState, out start, out end, out isPairMatch, out isExecAutoParamHelp);
        return new MSSQL_Token_JS(SourceSQL, tokenId, ScannerState, start, end, isPairMatch, isExecAutoParamHelp);
    }

    public override string ToString()
    {
        return String.Format("{0}:{1}", this.Value, this.Text);
    }
}

If you add Nuget package Microsoft.SqlServer.Management.SqlParser

you can use built-in parser.

using Microsoft.SqlServer.Management.SqlParser.Parser;
using Microsoft.SqlServer.Management.SqlParser.SqlCodeDom;
using static System.Console;

var parseResult = Parser.Parse("SELECT Id, Name FROM dbo.Customers WHERE State = 'MA'");
foreach (var batch in parseResult.Script.Batches)
{
    foreach (var statement in batch.Statements)
    {
        switch (statement)
        {
            case SqlSelectStatement selectStatement:
                ProcessSelectStatement(selectStatement);
                break;
            default:
                WriteLine("Unsupported statment. Printing inner XML");
                WriteLine(statement.Xml);
                break;
        }
    }
}

void ProcessSelectStatement(SqlSelectStatement selectStatement)
{
    var query = (SqlQuerySpecification)selectStatement.SelectSpecification.QueryExpression;
    var selectClause = query.SelectClause;
    WriteLine($"Select columns {string.Join(", ", selectClause.SelectExpressions.Select(_ => _.Sql))}");
    var fromClause = query.FromClause;
    WriteLine($"from tables {string.Join(", ", fromClause.TableExpressions.Select(_ => _.Sql))}");
    var whereClause = query.WhereClause;
    WriteLine($"where {whereClause.Expression.Sql}");
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM