
Parsing T-SQL To Extract Part of WHERE Clause

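I have a large SQL database containing "curves". Each curve has an ID (curveid). I am trying to determine the main users of each curve and whether all curves are actually used. To help with this, the DBAs have provided logs of every statement executed against the database.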

These statements can be very complex. All I want to do is extract the curveids being queried.

An example statement is below:

WITH G AS ( SELECT [Timevalue] FROM [mc].[GranularityLookup] 
WHERE [TimeValue] BETWEEN '19-Jul-2017 00:00' AND '30-Sep-2017 00:00' 
AND [1 Hr] = 1), 
D AS ( SELECT [CurveID], [DeliveryDate], [PublishDate], AVG([Value]) Value, MAX([PeriodNumber]) PeriodNumber 
FROM mc.CURVEID_6657_1_LATEST data 
JOIN 
(SELECT CurveID ID, DeliveryDate dDate, MAX(PublishDate) pDate 
FROM mc.CURVEID_6657_1_LATEST
WHERE CurveID = 90564
    AND DeliveryDate >= '19-Jul-2017 00:00' AND DeliveryDate <= '30-Sep-2017 00:00'
GROUP BY DeliveryDate,  CurveID ) Dates 
ON data.DeliveryDate = dates.dDate AND data.PublishDate = dates.pDate 
WHERE data.CurveID = 90564
AND data.DeliveryDate >= '19-Jul-2017 00:00' AND data.DeliveryDate <= '30-Sep-2017 00:00'
GROUP BY [CurveID], [PublishDate], [DeliveryDate] )
SELECT 
G.[TimeValue] [DeliveryDate] , D.[PublishDate], D.[Value], D.[PeriodNumber]
FROM 
G
LEFT JOIN 
D
ON 
G.[TimeValue] = D.[DeliveryDate]
ORDER BY DeliveryDate ASC, PeriodNumber ASC, publishDate DESC

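From this statement, all I am interested in is extracting that the user queried curveid 90564.

The statement may also look like either of the following: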

SELECT * FROM anytable WHERE curveid = 123 AND deliverydate BETWEEN '2017-01-01' AND '2017-02-01'

or

SELECT * FROM mc.anytable WHERE curveid IN (1,2,3,4,5,6,7)

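Again, I only want to know the curve IDs. I do not care about any other clauses.

I am using the Microsoft.SqlServer.TransactSql.ScriptDom namespace to parse the SQL, and I can now identify all the WHERE clauses using code similar to the following (pieced together from various other examples):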

string sql = @"WITH 
            G AS ( SELECT [Timevalue] FROM [mc].[GranularityLookup] 
            WHERE [TimeValue] BETWEEN '19-Jul-2017 00:00' AND '30-Sep-2017 00:00' 
            AND [1 Hr] = 1), 
            D AS ( SELECT [CurveID], [DeliveryDate], [PublishDate], AVG([Value]) Value, MAX([PeriodNumber]) PeriodNumber 
            FROM mc.CURVEID_6657_1_LATEST data 
            JOIN 
            (SELECT CurveID ID, DeliveryDate dDate, MAX(PublishDate) pDate 
            FROM mc.CURVEID_6657_1_LATEST
            WHERE CurveID = 90564
                AND DeliveryDate >= '19-Jul-2017 00:00' AND DeliveryDate <= '30-Sep-2017 00:00'
            GROUP BY DeliveryDate,  CurveID ) Dates 
            ON data.DeliveryDate = dates.dDate AND data.PublishDate = dates.pDate 
            WHERE data.CurveID = 90564
            AND data.DeliveryDate >= '19-Jul-2017 00:00' AND data.DeliveryDate <= '30-Sep-2017 00:00'
            GROUP BY [CurveID], [PublishDate], [DeliveryDate] )
            SELECT 
            G.[TimeValue] [DeliveryDate] , D.[PublishDate], D.[Value], D.[PeriodNumber]
            FROM 
            G
            LEFT JOIN 
            D
            ON 
            G.[TimeValue] = D.[DeliveryDate]
            ORDER BY DeliveryDate ASC, PeriodNumber ASC, publishDate DESC";
            var parser = new TSql120Parser(false);

            IList<ParseError> errors;
            var fragment = parser.Parse(new StringReader(sql), out errors);

            var whereVisitor = new WhereVisitor();
            fragment.Accept(whereVisitor);

            //  I now have all WHERE clauses in whereVisitor.WhereStatements

class WhereVisitor : TSqlConcreteFragmentVisitor
{
    public readonly List<WhereClause> WhereStatements = new List<WhereClause>();

    public override void Visit(WhereClause node)
    {
        WhereStatements.Add(node);
    }

}

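Each clause in whereVisitor.WhereStatements (3 in this example) exposes a property called SearchCondition. Unfortunately, this is where I have run out of ideas. What I want to achieve is logic like the following: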

foreach (var clause in whereVisitor.WhereStatements)
{
    //  IF any part of the clause filters based on curveid THEN

    //        Capture curveIDs

    //  END IF
}

Other details:

  • Using C# (.net 4.0)
  • SQL Server 2008
  • The DLL is Microsoft.SqlServer.TransactSql.ScriptDom (located, in my case, at 'c:\Program Files (x86)\Microsoft SQL Server\130\Tools\PowerShell\Modules\SQLPS\Microsoft.SqlServer.TransactSql.ScriptDom.dll')

Edit 1

Some additional information:

  • CurveID is a key to another table. Performing operations on it would make no sense in this case (e.g. curveId + 1 or curveId <= 10).

Edit 2 (Partial Solution)

The following visitor helps in the case where the clause is of the form curveid = 123:

class CurveIdVisitor : TSqlConcreteFragmentVisitor
{
    public readonly List<int> CurveIds = new List<int>();

    public override void Visit(BooleanComparisonExpression exp)
    {
        if (exp.FirstExpression is ColumnReferenceExpression && exp.SecondExpression is IntegerLiteral )
        {
            //  there is a possibility that this is of the ilk 'curveid = 123'
            //  we will look for the 'identifier'
            //  we take the last if there are multiple.  Example:
            //      alias.curveid
            //  gives two identifiers: alias and curveid
            if (
                ((ColumnReferenceExpression) exp.FirstExpression).MultiPartIdentifier.Identifiers.Last().Value.ToLower() ==
                "curveid")
            {
                //  this is definitely a curveid filter
                //  Now to find the curve id
                int curveid = int.Parse(((IntegerLiteral) exp.SecondExpression).Value);
                CurveIds.Add(curveid);
            }
        }
    }
}

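I finally solved this, and I hope it helps someone else in the future. Perhaps someone else will take the time to read it and provide a better solution.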

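using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.SqlServer.TransactSql.ScriptDom;

public class SqlParser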
{
    public List<int> GetQueriedCurveIds(string sql)
    {
        var parser = new TSql120Parser(false);

        IList<ParseError> errors;
        var fragment = parser.Parse(new StringReader(sql), out errors);

        List<int> curveIds = new List<int>();
        CurveIdVisitor cidv = new CurveIdVisitor();
        InPredicateVisitor inpv = new InPredicateVisitor();
        fragment.AcceptChildren(cidv);
        fragment.AcceptChildren(inpv);

        curveIds.AddRange(cidv.CurveIds);
        curveIds.AddRange(inpv.CurveIds);
        return curveIds.Distinct().ToList();
    }
}



class CurveIdVisitor : TSqlConcreteFragmentVisitor
{
    public readonly List<int> CurveIds = new List<int>();

    public override void Visit(BooleanComparisonExpression exp)
    {
        if (exp.FirstExpression is ColumnReferenceExpression && exp.SecondExpression is IntegerLiteral )
        {
            //  there is a possibility that this is of the ilk 'curveid = 123'
            //  we will look for the 'identifier'
            //  we take the last if there are multiple.  Example:
            //      alias.curveid
            //  gives two identifiers: alias and curveid
            if (
                ((ColumnReferenceExpression) exp.FirstExpression).MultiPartIdentifier.Identifiers.Last().Value.ToLower() ==
                "curveid")
            {
                //  this is definitely a curveid filter
                //  Now to find the curve id
                int curveid = int.Parse(((IntegerLiteral) exp.SecondExpression).Value);
                CurveIds.Add(curveid);
            }
        }
    }
}

class InPredicateVisitor : TSqlConcreteFragmentVisitor
{
    public readonly List<int> CurveIds = new List<int>();

    public override void Visit(InPredicate exp)
    {
        if (exp.Expression is ColumnReferenceExpression)
        {
            if (
                ((ColumnReferenceExpression) exp.Expression).MultiPartIdentifier.Identifiers.Last().Value.ToLower() ==
                "curveid")
            {
                foreach (var value in exp.Values)
                {
                    if (value is IntegerLiteral)
                    {
                        CurveIds.Add(int.Parse(((IntegerLiteral)value).Value));
                    }
                }
            }
        }
    }
}

This is cut-down code to illustrate the answer. In real life you will need to check the ParseError collection and add some error handling!
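As a rough illustration of that last point, the sketch below shows one way the ParseError collection might be checked before any visitor runs. The class name SqlParsing and the method ParseOrThrow are hypothetical and not part of the original answer, and throwing a FormatException is only one possible policy; when processing a large statement log it may be preferable to record the failure and skip the statement instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.SqlServer.TransactSql.ScriptDom;

public static class SqlParsing
{
    // Hypothetical helper: parses a statement and throws if ScriptDom
    // reported any errors, so visitors never run on a bad parse.
    public static TSqlFragment ParseOrThrow(string sql)
    {
        var parser = new TSql120Parser(false);

        IList<ParseError> errors;
        TSqlFragment fragment;
        using (var reader = new StringReader(sql))
        {
            fragment = parser.Parse(reader, out errors);
        }

        if (errors != null && errors.Count > 0)
        {
            // Each ParseError carries the position and a description of the problem.
            string details = string.Join("; ", errors.Select(e =>
                string.Format("line {0}, column {1}: {2}", e.Line, e.Column, e.Message)));
            throw new FormatException("Could not parse SQL: " + details);
        }

        return fragment;
    }
}
```

With a helper like this, GetQueriedCurveIds could call SqlParsing.ParseOrThrow(sql) in place of the direct parser.Parse call and let the caller decide how to handle the exception.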
