VBA Excel Find the last specific instance in a string

I have a list of strings in Excel, such as:

a>b>b>d>c>a

a>b>c>d

b>b>b>d>d>a

etc.

I want to extract the last c or the last d from each string, whichever comes last,

e.g.

a>b>b>d>c>a = c

a>b>c>d     = d

b>b>b>d>d>a = d

How would I do this using VBA (or just plain Excel, if that is possible)?

You could use an Excel formula, as follows.

To help explain, I will start with just one letter and then show the full formula at the end.

First, find the number of occurrences of c:

= LEN(A1) - LEN(SUBSTITUTE(A1,"c",""))

Use this count as the instance number to replace the last c with a unique character ($ as an example):

=SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))

Next find this unique character

= FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c",""))))

This gives the position of the last c. You can now use it in a MID function to return this last c:

= MID(A1,FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))),1)

Finally, to account for both c and d, use MAX to pick whichever position comes last:

= MID(A1,MAX(IFERROR(FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))),0),IFERROR(FIND("$",SUBSTITUTE(A1,"d","$",LEN(A1) - LEN(SUBSTITUTE(A1,"d","")))),0)),1)
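For checking the intermediate values, the same count, replace, and find steps can be sketched in VBA. This is only an illustration: the function name LastPosBySubstitute and the marker parameter are hypothetical, it assumes the code runs inside Excel (it calls WorksheetFunction.Substitute), and it assumes the marker character does not occur in the text.

```vba
' Hypothetical helper (not from the original answer): mirrors the worksheet
' steps above for a single search letter and returns its last position,
' or 0 when the letter is absent.
Function LastPosBySubstitute(s As String, letter As String, _
                             Optional marker As String = "$") As Long
    Dim occurrences As Long
    Dim marked As String

    ' Step 1: count the occurrences (correct as written only for a
    ' single-character search string)
    occurrences = Len(s) - Len(Replace(s, letter, ""))
    If occurrences = 0 Then Exit Function

    ' Step 2: replace only the last occurrence with the marker
    marked = Application.WorksheetFunction.Substitute(s, letter, marker, occurrences)

    ' Step 3: the marker's position is the position of the last occurrence
    LastPosBySubstitute = InStr(1, marked, marker)
End Function
```

In the Immediate window, ?LastPosBySubstitute("a>b>b>d>c>a", "c") should print 9 and ?LastPosBySubstitute("a>b>b>d>c>a", "d") should print 7, so MAX picks position 9 and MID returns c.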

Assuming c/d are just examples:

?LastEither("b>b>b>d>d>a", "c", "d")
d

Using

Function LastEither(testStr As String, find1 As String, find2 As String) As String
    Dim p1 As Long: p1 = InStrRev(testStr, find1)
    Dim p2 As Long: p2 = InStrRev(testStr, find2)
    If (p1 > p2) Then
        LastEither = find1
    ElseIf (p2 > 0) Then
        LastEither = find2
    End If
End Function

General solution:

?FindLastMatch("b>b>b>d>d>a>q>ZZ", ">", "c", "d")
d
?FindLastMatch("b>b>b>d>d>a>q>ZZ", ">", "c", "d", "q")
q
?FindLastMatch("b>b>b>d>d>a>q>ZZ>ppp", ">", "c", "d", "ZZ", "q")
ZZ

Using

Function FindLastMatch(testStr As String, delimiter As String, ParamArray findTokens() As Variant) As String
    Dim tokens() As String, i As Long, j As Long
    tokens = Split(testStr, delimiter)
    For i = UBound(tokens) To 0 Step -1
        For j = 0 To UBound(findTokens)
            If tokens(i) = findTokens(j) Then
                FindLastMatch = tokens(i)
                Exit Function
            End If
        Next
    Next
End Function
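FindLastMatch compares tokens with =, which is case-sensitive under VBA's default Option Compare Binary. If upper-case and lower-case tokens should be treated alike, a variant along the following lines might work. The name FindLastMatchCI is hypothetical; the only intended change is the StrComp comparison.

```vba
' Hypothetical case-insensitive variant of FindLastMatch.
Function FindLastMatchCI(testStr As String, delimiter As String, _
                         ParamArray findTokens() As Variant) As String
    Dim tokens() As String, i As Long, j As Long
    tokens = Split(testStr, delimiter)
    ' Walk the tokens from last to first and stop at the first hit
    For i = UBound(tokens) To 0 Step -1
        For j = LBound(findTokens) To UBound(findTokens)
            ' vbTextCompare ignores case; StrComp returns 0 when equal
            If StrComp(tokens(i), CStr(findTokens(j)), vbTextCompare) = 0 Then
                FindLastMatchCI = tokens(i)   ' returned as it appears in testStr
                Exit Function
            End If
        Next j
    Next i
End Function
```

For example, ?FindLastMatchCI("a>b>D>c>A", ">", "C", "d") should print c.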

And here is an array formula to do the same thing. (The formula was changed to avoid a problem with the original, pointed out by Grade 'Eh' Bacon.)

=MID(A1,MAX((MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)={"c","d"})*ROW(INDIRECT("1:"&LEN(A1)))),1)

An array formula is entered by holding down Ctrl+Shift while pressing Enter. If you do it correctly, Excel will place braces {...} around the formula, which you can see in the formula bar.

The formula will return a #VALUE! error if there is neither c nor d in the string.
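To make the array formula's logic easier to follow, here is a rough VBA equivalent. It is a sketch only: the name LastCharOf is hypothetical, and it returns an empty string where the formula would give #VALUE!. The worksheet = comparison ignores case, so the sketch uses StrComp with vbTextCompare to behave the same way.

```vba
' Hypothetical helper mirroring the array formula: it tests every character
' position and remembers the highest position holding one of the candidates.
Function LastCharOf(s As String, ParamArray candidates() As Variant) As String
    Dim pos As Long, k As Long, lastPos As Long

    For pos = 1 To Len(s)                 ' like ROW(INDIRECT("1:"&LEN(A1)))
        For k = LBound(candidates) To UBound(candidates)
            ' like MID(A1,pos,1)={"c","d"}, ignoring case
            If StrComp(Mid$(s, pos, 1), CStr(candidates(k)), vbTextCompare) = 0 Then
                lastPos = pos             ' positions only increase, so this acts as MAX
            End If
        Next k
    Next pos

    ' like the outer MID(A1, <max position>, 1)
    If lastPos > 0 Then LastCharOf = Mid$(s, lastPos, 1)
End Function
```

For example, ?LastCharOf("a>b>b>d>c>a", "c", "d") should print c.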

EDIT: Having seen from some of your comments that you might want to use words longer than a single character, I present the following User Defined Function. It allows words of any length, and you are not limited to just two words: you can use an arbitrary number of them.

You would enter a formula such as:

=LastOne(A8,"Charlie","Delta")

or

=LastOne(A8,$I1:$I2)

where I1 and I2 contain the words you wish to check for.

The words need to be separated by some delimiter that is not a letter, a digit, or an underscore.

A Regular Expression (regex) is constructed which consists of a pipe-separated | list of the words or phrases. The pipe | , in a regex, is the same as an OR . The \b at the beginning and end of the regex indicates a word boundary, that is, the point at which a word character (letter, digit, or underscore) is adjacent to a non-word character, or to the beginning or end of the string. Hence the actual delimiter does not matter, so long as it is not a word character.

All of the matches are placed in a Match Collection, and we only need the last item in that collection. There will be MC.Count matches and, since the collection is zero-based, the last match has index MC.Count - 1. Here is the code:

===========================================

Option Explicit
Function LastOne(sSearch As String, ParamArray WordList() As Variant) As String
    Dim RE As Object, MC As Object
    Dim sPat As String
    Dim RNG, C

    ' Build a pipe-separated list of the words; each argument may be
    ' a single word, an array, or a worksheet range
    For Each RNG In WordList
        If IsArray(RNG) Or IsObject(RNG) Then
            For Each C In RNG
                sPat = sPat & "|" & C
            Next C
        Else
            sPat = sPat & "|" & RNG
        End If
    Next RNG

    ' Drop the leading pipe and wrap the alternatives in word boundaries
    sPat = "\b(?:" & Mid(sPat, 2) & ")\b"

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        .IgnoreCase = True
        If .Test(sSearch) = True Then
            Set MC = .Execute(sSearch)
            ' The match collection is zero-based, so the last match is at Count - 1
            LastOne = MC(MC.Count - 1)
        End If
    End With
End Function

===========================================
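As a small illustration of the pattern and the zero-based match collection described above, the following sketch runs the regex for two words against a made-up string. The Sub name DemoLastMatch and the sample text are assumptions for demonstration only.

```vba
' Hypothetical demo: shows the pattern built for "Charlie" and "Delta"
' and how the last item of the match collection is read.
Sub DemoLastMatch()
    Dim RE As Object, MC As Object

    Set RE = CreateObject("VBScript.RegExp")
    RE.Global = True          ' collect every match, not just the first
    RE.IgnoreCase = True
    RE.Pattern = "\b(?:Charlie|Delta)\b"

    Set MC = RE.Execute("Alpha>Delta>Charlie>Bravo>delta>Echo")

    Debug.Print MC.Count                 ' prints 3 (Delta, Charlie, delta)
    Debug.Print MC(MC.Count - 1).Value   ' prints delta, the last match
End Sub
```

Note that words containing regex metacharacters (such as . or +) would be interpreted as part of the pattern, since nothing here escapes them.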

Note that an absence of a WordList word will result in a blank cell. One could produce an error if that is preferable.

Assuming your string is in cell A1, and there are no uses of the tilde (~) character in it, you can use the following in a worksheet:

=IF(IFERROR(FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))),0)>IFERROR(FIND("~",SUBSTITUTE(A1,"d","~",LEN(A1)-LEN(SUBSTITUTE(A1,"d","")))),0),"c","d")

EDIT:
In response to a comment, here's an explanation of how this works. I've also neatened up the formula slightly after looking at it again. The two sub-formulae for c and d are identical, so the explanation applies to both. Working outwards:

LEN(A1)-LEN(SUBSTITUTE(A1,"c",""))

Here we remove all instances of c from the string. By comparing the length of this calculated string and the original string, we calculate the number of times c appears in the original string.

SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))

Now that we know the number of times c appears in our string, we replace the last occurrence of c with the tilde character (here we assume the tilde isn't used in the string otherwise).

FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c",""))))

We then find the position of the tilde in the string, which is equivalent to the position of the last c in the string.

IFERROR(FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))),0)

Wrapping this in an IFERROR ensures that we don't have errors coming through the formula - setting the value to 0 if no c exists ensures that we still get a correct answer if our string contains c but not d (and vice versa).

We then apply the same calculation to d and compare the two to see which occurs later in our string. Note: if the string contains neither c nor d, both positions are 0, so the comparison is false and the formula returns "d", which is incorrect.

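In VBA you can do this using the following simple logic.

Sub ShowLastCOrD()
    Dim str As String
    Dim Cet() As String
    Dim i As Long

    str = "a>b>b>d>c>a"
    Cet = Split(str, ">")

    ' Walk the tokens from last to first; Step -1 is needed to count down
    For i = UBound(Cet) To LBound(Cet) Step -1
        ' Each comparison must be written out in full;
        ' LCase$ covers both lower and upper case
        If LCase$(Cet(i)) = "c" Or LCase$(Cet(i)) = "d" Then
            MsgBox Cet(i)
            Exit For
        End If
    Next i
End Sub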
