
Excel VBA: Splitting a big string with no good delimiters

Dear all,

I need to split a large string held in a single cell, with no good delimiters. It is point-by-point data from a tennis match, exported directly to an Excel workbook by third-party software.

Unfortunately, I don't know VBA well enough to solve this on my own, and I could not find a similar example here on the forum. Can some blessed soul help me, please?

This is an example of the content of my A1 cell:

0-0 [*0-0] [0-15*] [15-15*] [15-30*] [30-30*] [40-30*] [40-40*] [40-A*] [40-40*] [A-40*] 1-0 [*0-0] [*0-15] [*15-15] [*15-30] [*30-30] [*40-30] 2-0 [0-0*] [15-0*] [30-0*] [30-15*] [40-15*] 3-0 [*0-0] [*0-15] [*15-15] [*30-15] [*40-15] 4-0 [0-0*] [15-0*] [30-0*] [40-0*] 5-0 [*0-0] [*15-0] [*15-15] [*30-15] [*40-15] 6-0 0-0 [0-0*] [0-15*] [0-30*] [0-40*] 6-0 0-1 [*0-0] [*0-15] [*15-15] [*15-30] [*30-30] [*30-40] [*40-40] [*A-40] 6-0 1-1 [0-0*] [0-15*] [15-15*] [30-15*] [30-30*] [40-30*] 6-0 2-1 [*0-0] [*15-0] [*15-15] [*15-30] [*30-30] [*40-30] [*40-40] [*A-40] [*40-40] [*A-40] [*40-40] [*40-A] 6-0 2-2 [0-0*] [0-15*] [0-30*] [15-30*] [15-40*] 6-0 2-3 [*0-0] [*0-15] [*0-30] [*0-40] 6-0 2-4 [0-0*] [0-15*] [0-30*] [0-40*] 6-0 2-5 [*0-0] [*15-0] [*30-0] [*30-15] [*40-15] 6-0 3-5 [0-0*] [0-15*] [0-30*] [15-30*] [30-30*] [40-30*] 6-0 4-5 [*0-0] [*15-0] [*30-0] [*40-0] 6-0 5-5 [0-0*] [0-15*] [15-15*] [30-15*] [30-30*] [30-40*] [40-40*] [40-A*] 6-0 5-6 [*0-0] [*15-0] [*30-0] [*30-15] [*40-15] [*40-30] 6-0 6-6 [0-0*] [*1-0] [*2-0] [2-1*] [3-1*] [*4-1] [*5-1] [6-1*] 6-0 7-6(1)
  • The * indicates who is serving.
  • The numbers inside the brackets are the points inside each game or in a tiebreak.
  • The numbers outside the brackets are the final score of each game.
  • After the end of the first set (6-X or 7-5), the numbers outside the brackets also include the scores of the previous sets.

Important: the first characters, before the first real point [0-15*], are useless, IMO. First, because the indication of who is serving is usually wrong there (as in this example); second, because the string sometimes starts slightly differently, without the first "0-0" or with some extra useless zeros, like "0-0 [0-0] [* 0-0]".

That said, I need to extract only two things from this data:

  • A column saying who served the first game (left player or right player);
  • The sequence of game scores only (without the point-by-point), in separate columns.

Like this:

*1-0 | 1-1 | 2-1 | 3-1 | 4-1 ...*

I already did this with Excel formulas, but it took dozens of new columns, each with a big, inefficient formula, which makes the workbook impossible to process in Excel.

Is there an easier way to do this with Excel VBA? Or do I have to use other software or another language, like R or Power BI?

Based on the example you've provided, the macro below builds an array (aGameScore) with the details of each set. The number of elements in the array corresponds to the number of sets listed in the string. Each element starts with either Left Player or Right Player, indicating which player served first in that set, followed by the score of each game listed in the string.

Sub GetScores()

    Dim aScores As Variant
    Dim aGameScore As Variant
    Dim iC&, iHyphLoc&, iServerLoc&
    Dim sServer$

    With ThisWorkbook.Worksheets("Sheet13")

        aScores = Split(.Range("A2"), " ")

        For iC = LBound(aScores) To UBound(aScores)
            If InStr(1, aScores(iC), "0-0") > 0 And InStr(1, aScores(iC), "[") = 0 Then
                ' Set who served in the first game
                iC = iC + 1
                If IsArray(aGameScore) Then
                    ReDim Preserve aGameScore(UBound(aGameScore) + 1)
                Else
                    ReDim aGameScore(0)
                End If
                iHyphLoc = InStr(1, aScores(iC), "-")
                iServerLoc = InStr(1, aScores(iC), "*")
                If iHyphLoc > iServerLoc Then
                    aGameScore(UBound(aGameScore)) = "Left Player"
                Else
                    aGameScore(UBound(aGameScore)) = "Right Player"
                End If
            ElseIf InStr(1, aScores(iC), "[") = 0 Then
                ' Capture game scores
                If iC = UBound(aScores) Then
                    aGameScore(UBound(aGameScore)) = aGameScore(UBound(aGameScore)) & " | " & Trim(aScores(iC))
                ElseIf InStr(1, aScores(iC + 1), "[") <> 0 Then
                    aGameScore(UBound(aGameScore)) = aGameScore(UBound(aGameScore)) & " | " & Trim(aScores(iC))
                End If
            End If
        Next

    End With

End Sub
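
The question notes that the serve marker on the opening [*0-0] token is often wrong, so the first server may be read more reliably from the first real point instead. The following is only a sketch of that idea: FirstServer is a hypothetical name, and it assumes that point tokens have the form [15-0*] or [*0-15] with no spaces inside the brackets.

```vba
Option Explicit

' Hypothetical UDF (not part of the original answer): returns "Left Player" or
' "Right Player" for the first game, using the first complete point token whose
' score is not 0-0. Returns an empty string when no such token is found.
Public Function FirstServer(ByVal matchText As String) As String
    Dim tokens() As String
    Dim i As Long
    Dim token As String, score As String
    Dim starPos As Long, hyphenPos As Long

    tokens = Split(Trim$(matchText), " ")

    For i = LBound(tokens) To UBound(tokens)
        token = tokens(i)
        ' Only consider complete tokens such as "[0-15*]"; this skips
        ' fragments like "[*" and "0-0]" from a malformed prefix.
        If Left$(token, 1) = "[" And Right$(token, 1) = "]" Then
            starPos = InStr(1, token, "*")
            hyphenPos = InStr(1, token, "-")
            ' Strip brackets and the serve marker to get the bare score.
            score = Replace(Replace(Replace(token, "[", ""), "]", ""), "*", "")
            If starPos > 0 And hyphenPos > 0 And score <> "0-0" Then
                ' A star before the hyphen marks the left player as server.
                If starPos < hyphenPos Then
                    FirstServer = "Left Player"
                Else
                    FirstServer = "Right Player"
                End If
                Exit Function
            End If
        End If
    Next i
End Function
```

Placed in a standard module, it could be called from a worksheet as =FirstServer(A1).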

At the moment the GetScores macro only reads the text in cell A2 of the worksheet named "Sheet13". You can modify it further to loop over all the cells in your range.
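
As one possible way to cover a whole range, the sketch below loops over every filled cell in column A of the active sheet and writes the game scores to columns B, C, D and so on. The names ParseGameScores, FlushRun and ExtractAllGameScores are hypothetical, and the parsing rule is an assumption drawn from the sample string: in each run of unbracketed tokens the last one is the current game score, unless it is "0-0" (a new set starting), in which case the token before it is the final score of the set that just ended. It also assumes the cells hold plain text rather than error values.

```vba
Option Explicit

' Hypothetical helper: collects the game scores of one match string.
Private Function ParseGameScores(ByVal matchText As String) As Collection
    Dim scores As Collection
    Dim tokens() As String
    Dim i As Long
    Dim runPrev As String, runLast As String

    Set scores = New Collection
    tokens = Split(Trim$(matchText), " ")

    For i = LBound(tokens) To UBound(tokens)
        If Len(tokens(i)) = 0 Then
            ' Skip empty tokens caused by repeated spaces.
        ElseIf InStr(1, tokens(i), "[") > 0 Or InStr(1, tokens(i), "]") > 0 Then
            ' A point token ends the current run of unbracketed tokens.
            FlushRun scores, runPrev, runLast
        Else
            runPrev = runLast
            runLast = tokens(i)
        End If
    Next i

    FlushRun scores, runPrev, runLast   ' flush the final run
    Set ParseGameScores = scores
End Function

' Hypothetical helper: stores the score of a finished run and resets the run.
Private Sub FlushRun(ByVal scores As Collection, _
                     ByRef runPrev As String, ByRef runLast As String)
    Dim score As String

    If runLast = "0-0" Then score = runPrev Else score = runLast
    If Len(score) > 0 Then scores.Add score

    runPrev = vbNullString
    runLast = vbNullString
End Sub

' Hypothetical macro: one match string per row in column A, scores from column B.
Public Sub ExtractAllGameScores()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long, c As Long
    Dim scores As Collection

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        Set scores = ParseGameScores(CStr(ws.Cells(r, "A").Value))
        For c = 1 To scores.Count
            ' Format as text first so Excel does not turn "1-1" into a date.
            ws.Cells(r, c + 1).NumberFormat = "@"
            ws.Cells(r, c + 1).Value = scores(c)
        Next c
    Next r
End Sub
```

A Collection is used here only to avoid repeated ReDim Preserve calls; an array would work just as well.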
