简体   繁体   English

正则表达式查找不在引号中且不在括号中的逗号分隔符

[英]Regex find comma delimiters not in quotes AND not in parenthesis

My ultimate goal is to develop a function to convert Access IIF() statements to T-SQL (2008) CASE WHEN statments.我的最终目标是开发一个函数来将 Access IIF() 语句转换为 T-SQL (2008) CASE WHEN 语句。 I already have a VBA routine that finds the IIF and the matching closing parenthesis even if it is outside of quotes.我已经有一个 VBA 例程可以找到 IIF 和匹配的右括号,即使它在引号之外。 It is recursive and finds nested IIF() statements without using RegEx.它是递归的,可以在不使用 RegEx 的情况下查找嵌套的 IIF() 语句。 When I narrow down to the text inside the IIF parenthesis I need to identify the two comma delimiters that separate the three parameters.当我缩小到 IIF 括号内的文本时,我需要确定分隔三个参数的两个逗号分隔符。 I am having trouble when a parenthesis is inside of quotes.当括号在引号内时,我遇到了麻烦。 How can I setup the RegEx to ignore anything in quotes before processing the rest of the expression?在处理表达式的其余部分之前,如何设置 RegEx 以忽略引号中的任何内容?

I'm trying to create an expression group that will find anything inside single quotes and anything inside of parenthesis, then exclude anything that matches that group, and find the commas.我正在尝试创建一个表达式组,它将在单引号内和括号内的任何内容中找到任何内容,然后排除与该组匹配的任何内容,并找到逗号。 (Please forgive me if I'm not saying this correctly since 'capturing group' and 'non-capturing group' sometimes give me the opposite of what I expect). (如果我说得不对,请原谅我,因为“捕获组”和“非捕获组”有时会给我与我期望的相反的结果)。

Note that this solution has to work with the VBScript Regular Expression support which is basically the same as the JavaScript flavor.请注意,此解决方案必须使用与 JavaScript 风格基本相同的 VBScript 正则表达式支持。

condition, true, false <-- this is the string my IIF parsing function returns before trying to split into 3 parts. condition, true, false <-- 这是我的 IIF 解析函数在尝试拆分为 3 部分之前返回的字符串。

This is the expression that I've pieced together so far:这是我到目前为止拼凑起来的表达方式:

,(?=([^']*'[^']*')*(?![^']*'))(?![^(]*[)])

which works on this:这适用于:

a=1, nz(b,0), Left('xy,z',2)

But these lines are more challenging.但这些线路更具挑战性。 I can't find an expression that works on all of them.我找不到适用于所有这些的表达式。

a=1, '1st)', '(2nd)' 
left(right(a,5),1)='b', '1st)', '(2nd)'
a=1, Left('a,bc',1) , 'xy,z'

Here's a Regex101 that I've been working on:这是我一直在研究的 Regex101:

https://regex101.com/r/qH0wD8/2 https://regex101.com/r/qH0wD8/2

The answer is, what I was trying to do was not appropriate for RegEx.答案是,我试图做的不适合 RegEx。 Instead I solved the problem by writing a parser that handles nested parenthesis, matching quotes, and miscellaneous commas.相反,我通过编写一个处理嵌套括号、匹配引号和其他逗号的解析器来解决这个问题。

I hope you find this answer because you need a function to convert IIF() statements to CASE WHEN and not my complex RegEx inquiry.我希望你能找到这个答案,因为你需要一个函数来将 IIF() 语句转换为 CASE WHEN而不是我复杂的 RegEx 查询。 The main use case I have for this function is converting Access SQL to T-SQL for SQL Server 2008 and earlier.这个函数的主要用例是将 Access SQL 转换为 SQL Server 2008 及更早版本的 T-SQL。 SQL Server 2012 and later support the IIF() function. SQL Server 2012 及更高版本支持 IIF() 函数。 You can use this VBA function in any VBA editor.您可以在任何 VBA 编辑器中使用此 VBA 函数。 You can use Excel if Access isn't handy.如果 Access 不方便,您可以使用 Excel。

Here is a VBA function to convert Access IIF() statements to T-SQL CASE WHEN statments这是一个将 Access IIF() 语句转换为 T-SQL CASE WHEN 语句的 VBA 函数

Public Function ReplaceIIFwithCASE(ByVal strInput As String) As String
' Parse the passed string and find the first "IIF(" and parse it into
' a standard T-SQL CASE WHEN statement.  If a nested IIF is found,
' recurse and call this function again.
'
' Ben Sacherich - May 2016-Feb 2017
'
' This supports:
'   IIF() embedded anywhere in the input string.
'   Nested IIF() statements.
'   The input string containing multiple IIF() statements on the same level.
'   Strings between open and close IIF parenthesis that contains literal commas or commas for other functions.
'       Example:  IIF(a=1, nz(b,0) , Left('xy,z',2))
'
' Be aware:
'   This does not optimize the CASE statement in places where a nested IIF could
'     be written as a single CASE statement.
'   It will fail if text inside IIF() containes the pipe character |. If you
'     need to process statements with this character, modify this routine
'     to use another temporary character for |.
'
' Try these in the Immediate window:
'   ? ReplaceIIFwithCASE("IIF(a=1, nz(b,0) , Left('xy,z',2))")
'   ? ReplaceIIFwithCASE("IIf(x='one',IIf(Abs(Z)=1,2,3),'three')")
'   ? ReplaceIIFwithCASE("IIF(a=1,'1st)', '2nd)')")
'   ? ReplaceIIFwithCASE("SELECT Name, IIF(Gender='M', 'Boy', 'Girl') FROM Students")
'
' How this works:
'   Find "IIF(" in the passed string.  Return original string if not found.
'   Search for the matching closing parenthesis.
'   When the match is found, recurse and make sure an "IIF(" is not nested.
'   After recursing, replace the IIF with a CASE statement.
'       - Once I find the inner part of an IIF this will use the Split function
'         to delimit by commas "," into an array.
'       - Then it looks at each array element. If it contains an odd number of
'         single or double quote characters or different number of opening and
'         closing parenthesis, it combines the array element part with the next
'         part and tests again.
'       - When there are matched single/double quotes and equivalent number of
'         parenthesis it holds that part and appends the "|" character.  This
'         means that it has identified one of the 3 parameters that is passed
'         to the IIF function.
'       - Then it splits the string by the "|" character into three pieces
'         and builds the CASE statement.
'   Continue searching the passed string for another occurrence of "IIF(" (not nested).

    Dim lngFuncStart    As Long
    Dim lngPosition     As Long
    Dim intStack        As Integer
    Dim strFunction     As String
    Dim strChar         As String
    Dim strQuoteChar    As String
    Dim bolInQuotes     As Boolean
    Dim strSplit()      As String
    Dim ReturnValue     As String

    On Error GoTo ErrorHandler

    strFunction = "IIF("
    strQuoteChar = "'"      ' Define the style of quotes to look for and exclude.
    bolInQuotes = False     ' We are currently not inside quotes.

    lngFuncStart = InStr(1, strInput, strFunction, vbTextCompare)

    If lngFuncStart > 0 Then
        lngFuncStart = lngFuncStart + Len(strFunction)
        intStack = 1
        lngPosition = lngFuncStart

        Do While lngPosition <= Len(strInput)
        ' Use a WHILE loop instead of a FOR loop because the current and end positions will change inside the loop.

            strChar = Mid(strInput, lngPosition, 1)

            If strChar = strQuoteChar Then
                bolInQuotes = Not bolInQuotes
                ' Move on to the next character

            ElseIf bolInQuotes = False Then
                ' We are currently not inside quotes.

                Select Case strChar
                    Case ")"
                        ' Closing a group
                        intStack = intStack - 1
                    Case "("
                        ' Starting new group
                        intStack = intStack + 1
                End Select

                If intStack = 0 Then ' Found closing parenthesis.

                    ' See if there is a nested match.  ### Recursive ###
                    ReturnValue = ReplaceIIFwithCASE(Mid(strInput, lngFuncStart, lngPosition - lngFuncStart))

                    ' Begin parsing commas.
                    strSplit() = Split(ReturnValue, ",")

                    Dim strPart         As String
                    Dim strRebuilt      As String
                    Dim i               As Integer

                    strRebuilt = ""
                    If UBound(strSplit()) > 2 Then ' There are more than 2 commas.  Piece together the parts.

                        strPart = strSplit(0)
                        For i = 1 To UBound(strSplit)

                            If UBound(Split(strPart, "'")) Mod 2 = 0 _
                              And UBound(Split(strPart, """")) Mod 2 = 0 _
                              And UBound(Split(strPart, "(")) = UBound(Split(strPart, ")")) Then
                                ' Number of single quotes is Even or Zero (matched)
                                ' Number of double quotes is Even or Zero (matched)
                                ' Number of parenthesis is matched

                                ' Add the "|" symbol where the IIF should have commas.
                                strRebuilt = strRebuilt & "|" & strPart
                                strPart = strSplit(i)
                            Else
                                strPart = strPart & "," & strSplit(i)
                            End If
                        Next
                        ReturnValue = Mid(strRebuilt & "|" & strPart, 2)

                        strSplit() = Split(ReturnValue, "|")
                    End If

                    If UBound(strSplit) = 2 Then
                        ' IIF has 3 parameters and is the normal case.
                        '--- Replace the IIF statement with CASE WHEN ---
                        ' CASE statement help:  https://msdn.microsoft.com/en-us/library/ms181765.aspx
                        ReturnValue = "(CASE WHEN " & Trim(strSplit(0)) & " THEN " & Trim(strSplit(1)) & " ELSE " & Trim(strSplit(2)) & " END)"
                        'ReturnValue = "(CASE WHEN...)"
                        If Right(Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1), 2) = vbCrLf Then
                            ' Don't bother to add a CrLf
                        Else
                            ' Add a CrLf before the CASE statement to make identification easier.
                            ' Comment this out if you don't want it added.
                            ReturnValue = vbCrLf & ReturnValue
                        End If
                        strInput = Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1) & ReturnValue & Mid(strInput, lngPosition + 1)
                    Else
                        ' Something is wrong.  Return the original IIF statement.
                        ' Known issues:
                        '       Text inside IIF() contained pipe character |
                        '       Text contained unmatched parenthesis, maybe inside of a literal string like '1st)'
                        ReturnValue = "IIF(" & ReturnValue & ") /*### Unable to parse IIF() ###*/ "
                        strInput = Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1) & ReturnValue & Mid(strInput, lngPosition + 1)

                    End If

                    '--- Check to see if there is another function call following the one just addressed. ---
                    lngFuncStart = InStr(lngFuncStart + Len(ReturnValue) - Len(strFunction), strInput, strFunction, vbTextCompare)
                    If lngFuncStart > 0 Then
                        ' Another IIF function call is at the same level as the one just processed.
                        lngFuncStart = lngFuncStart + Len(strFunction)
                        intStack = 1
                        lngPosition = lngFuncStart
                    Else
                        ReturnValue = strInput
                        Exit Do
                    End If

                End If
            End If
            lngPosition = lngPosition + 1
        Loop

    Else
        ' Function not found in passed string.
        ReturnValue = strInput
    End If

    ReplaceIIFwithCASE = ReturnValue

    Exit Function

ErrorHandler:

    MsgBox "Error #" & Err.Number & " - " & Err.Description & vbCrLf & "in procedure ReplaceIIFwithCASE()" _
        & vbCrLf & "Input:  " & strInput, vbExclamation, "Error"

End Function

You can use recursive matching:您可以使用递归匹配:

(?>(?>\([^()]*(?R)?[^()]*\))|(?>'[^']*')|(?>[^()' ,]+))+

The outer match is allowed to be repeated (?>...)+ .允许重复外部匹配(?>...)+ Inside, there are three options.在里面,有三个选项。 The first one matches balanced () :第一个匹配平衡()

(?>\([^()]*(?R)?[^()]*\)

The second matches anything between single quotes '...' :第二个匹配单引号'...'之间的任何内容:

(?>'[^']*')

The third matches anything except () , or ' , or comma, or space:第三个匹配除()' 、逗号或空格之外的任何内容:

(?>[^()' ,]+)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM