简体   繁体   中英

VBA RegEx identifiying multiple patterns - Excel

I´m rather new to VBA RegEx, but thanks to this stackoverflow thread ,

I am getting to it. I have a problem and hope that somebody can help. In row 1 in Excel I have multiple Strings with different city/country attribution. Example:

A1: "/flights/munich/newyork"
A2: "flights/munich/usa"
A3:  "flights/usa/germany"
...

What I wanna have now, is a VBA that goes though those strings with RegEx and prompts a categorisation value if the RegEx is met. Example:

A1: "/flights/munich/new-york" categorises as "city to city"
A2: "flights/munich/usa" categorises as "city to country"
A3:  "flights/usa/germany" categorises as "country to country"

Right now, I have a code that will return the "city to city" category to me, but I can´t figure out who to get a code that handles the multiple patterns and returns the corresponding output string.

In short, a logic like this is needed:

If A1 contains RegEx ".*/munich/new-york" then return output string "city to city" , if A1 contains RegEx ".*/munich/usa" then return output string "city to country" and so on.

Guess this has something to to with how to handle multiple if statements with multiple patterns in VBA, but I can´t figure it out.

This is how my code looks right now - hope you can help!

Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String

    strPattern = "(munich|berlin|new-york|porto|copenhagen|moscow)/(munich|berlin|new-york|porto|copenhagen|moscow)"

    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = "CITY TO CITY"

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.Test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "NO MATCH FOUND"
        End If
    End If

End Function

Like @dbmitch mentions in the comments, you can't do this with a single Regex - you'll need to use 3 of them. I'd personally put the cities and countries into Consts and build the patterns as need. You can then pass them (along with the strReplace ) as parameters to simpleCellRegex function:

Const CITIES As String = "(munich|berlin|new-york|porto|copenhagen|moscow)"
Const COUNTRIES As String = "(germany|france|usa|russia|etc)"

Function simpleCellRegex(Myrange As Range, strReplace As String, strPattern As String) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String

    If strPattern <> "" Then
        strInput = Myrange.Value

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "NO MATCH FOUND"
        End If
    End If
End Function

Called like this:

foo = simpleCellRegex(someRange, "CITY TO CITY", CITIES & "/" & CITIES)
foo = simpleCellRegex(someRange, "CITY TO COUNTRY", CITIES & "/" & COUNTRIES)
foo = simpleCellRegex(someRange, "COUNTRY TO COUNTRY", COUNTRIES & "/" & COUNTRIES)

Note: If you're doing this in a loop, it would be wildly more efficient to only build each RegExp once , and then pass that as a parameter instead of the pattern.

A little (maybe) "out of the box" solution:

Option Explicit

Sub main()
    Const CITIES As String = "MUNICH|BERLIN|NEWYORK|PORTO|COPENHAGEN|MOSCOW"
    Const COUNTRIES As String = "USA|GERMANY"

    With Worksheets("FLIGHTS")
        With .Range("A1", .Cells(.Rows.Count, 1).End(xlUp))
            With .Offset(, 1)
                .value = .Offset(, -1).value
                .Replace What:="*flights/", replacement:="", LookAt:=xlPart, MatchCase:=False
                .Replace What:="/", replacement:=" to ", LookAt:=xlPart, MatchCase:=False
                ReplaceElement .Cells, CITIES, "city"
                ReplaceElement .Cells, COUNTRIES, "country"
            End With
        End With
    End With
End Sub

Sub ReplaceElement(rng As Range, whats As String, replacement As String)
    Dim elem As Variant

    With rng
        For Each elem In Split(whats, "|")
            .Replace What:=elem, replacement:=replacement, LookAt:=xlPart, MatchCase:=False
        Next elem
    End With
End Sub

note

  • replace() methods can be taught to ignore cases but beware to have consistency between names: "newyork" will never match "new-york"

I would do this a bit differently. I would make the regex pattern the start or end point, and match it against a comma delimited string of cities or countries.

Given what you have presented, the start and end points will always be the last two / separated units.

So something like:

Option Explicit
Sub CategorizeFlights()
    Dim rData As Range, vData As Variant
    Dim arrCity() As Variant
    Dim arrCountry() As Variant
    Dim I As Long, J As Long
    Dim sCategoryStart As String, sCategoryEnd As String
    Dim V As Variant
    Dim RE As RegExp

arrCity = Array("munich", "newyork")
arrCountry = Array("usa", "germany")

Set RE = New RegExp
With RE
    .Global = False
    .ignorecase = True
End With

Set rData = Range("A2", Cells(Rows.Count, "A").End(xlUp)).Resize(columnsize:=2)
vData = rData

For I = 1 To UBound(vData, 1)
    V = Split(vData(I, 1), "/")

        RE.Pattern = "\b" & V(UBound(V) - 1) & "\b"
        If RE.test(Join(arrCity, ",")) = True Then
                sCategoryStart = "City to "
        ElseIf RE.test(Join(arrCountry, ",")) = True Then
                sCategoryStart = "Country to "
        Else
                sCategoryStart = "Unknown to "
        End If

        RE.Pattern = "\b" & V(UBound(V)) & "\b"
        If RE.test(Join(arrCity, ",")) = True Then
                sCategoryEnd = "City"
        ElseIf RE.test(Join(arrCountry, ",")) = True Then
                sCategoryEnd = "Country"
        Else
                sCategoryEnd = "Unknown"
        End If

    vData(I, 2) = sCategoryStart & sCategoryEnd
Next I

With rData
    .Value = vData
    .EntireColumn.AutoFit
End With

End Sub

As is sometimes the case, a similar algorithm can be used without regular expressions, but I assume this is an exercise in its use.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM