I have a list of strings in Excel, such as:
a>b>b>d>c>a
a>b>c>d
b>b>b>d>d>a
etc.
I want to extract the last c or the last d from each string, whichever comes last, e.g.
a>b>b>d>c>a = c
a>b>c>d = d
b>b>b>d>d>a = d
How would I do this using VBA (or just straight Excel, if that is possible)?
You could use an Excel formula as follows.
To help explain, I will start with just one letter and then show the full formula at the end.
First, find the number of occurrences of c:
= LEN(A1) - LEN(SUBSTITUTE(A1,"c",""))
Use this count as the instance number to replace the last c with a unique character ($ as an example):
=SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))
Next, find the position of this unique character:
= FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c",""))))
This gives the position of the last c. You can now use it in a MID function to return this last c:
= MID(A1,FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))),1)
Finally, to account for both c and d, use MAX to pick whichever position comes last:
= MID(A1,MAX(IFERROR(FIND("$",SUBSTITUTE(A1,"c","$",LEN(A1) - LEN(SUBSTITUTE(A1,"c","")))),0),IFERROR(FIND("$",SUBSTITUTE(A1,"d","$",LEN(A1) - LEN(SUBSTITUTE(A1,"d","")))),0)),1)
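The steps above can also be traced in the VBA Immediate window. This is only a sketch for following the intermediate values, assuming the first sample string from the question; the procedure name TraceLastC is hypothetical and not part of the answer.

```vba
Sub TraceLastC()
    Dim s As String: s = "a>b>b>d>c>a"

    ' Step 1: count the occurrences of "c" (here: 1)
    Dim n As Long
    n = Len(s) - Len(Replace(s, "c", ""))

    ' Step 2: replace only the n-th (i.e. last) "c" with "$"
    ' (here: a>b>b>d>$>a). Assumes n > 0; SUBSTITUTE fails for instance 0.
    Dim marked As String
    marked = Application.WorksheetFunction.Substitute(s, "c", "$", n)

    ' Step 3: the position of "$" is the position of the last "c" (here: 9)
    Dim pos As Long
    pos = InStr(marked, "$")

    ' Step 4: read the character at that position from the original string
    Debug.Print n, marked, pos, Mid$(s, pos, 1)
End Sub
```

As with the formula, this relies on "$" not appearing anywhere in the original string.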
Assuming c/d are just examples:
?LastEither("b>b>b>d>d>a", "c", "d")
d
Using
Function LastEither(testStr As String, find1 As String, find2 As String) As String
    Dim p1 As Long: p1 = InStrRev(testStr, find1)
    Dim p2 As Long: p2 = InStrRev(testStr, find2)
    If (p1 > p2) Then
        LastEither = find1
    ElseIf (p2 > 0) Then
        LastEither = find2
    End If
End Function
General solution:
?FindLastMatch("b>b>b>d>d>a>q>ZZ", ">", "c", "d")
d
?FindLastMatch("b>b>b>d>d>a>q>ZZ", ">", "c", "d", "q")
q
?FindLastMatch("b>b>b>d>d>a>q>ZZ>ppp", ">", "c", "d", "ZZ", "q")
ZZ
Using
Function FindLastMatch(testStr As String, delimiter As String, ParamArray findTokens() As Variant) As String
    Dim tokens() As String, i As Long, j As Long
    tokens = Split(testStr, delimiter)
    For i = UBound(tokens) To 0 Step -1
        For j = 0 To UBound(findTokens)
            If tokens(i) = findTokens(j) Then
                FindLastMatch = tokens(i)
                Exit Function
            End If
        Next
    Next
End Function
And here is an array formula to do the same thing. (Changed formula to avoid a problem with the original pointed out by Grade 'Eh' Bacon)
=MID(A1,MAX((MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)={"c","d"})*ROW(INDIRECT("1:"&LEN(A1)))),1)
An array formula is entered by holding down ctrl+shift while hitting enter. If you do it correctly, Excel will place braces {...} around the formula, which you can see in the formula bar.
The formula will return a #VALUE! error if there is neither c nor d in the string.
EDIT: Having seen from some of your comments that you might want to use more than single character words, I present the following User Defined Function. It allows you to use words of any length, and also you are not limited to just two words -- you can use an arbitrary number of words.
You would enter a formula such as:
=LastOne(A8,"Charlie","Delta")
or
=LastOne(A8,$I1:$I2)
where I1 and I2 contain the words you wish to check for.
The words need to be separated by some delimiter that is neither a letter nor a digit.
A Regular Expression (regex) is constructed which consists of a pipe-separated (|) list of the words or phrases. The pipe |, in a regex, is the same as an OR. The \b at the beginning and end of the regex indicates a word boundary -- that is, the point at which a digit or letter is adjacent to a non-digit or non-letter, or the beginning or end of the string. Hence the actual delimiter does not matter, so long as it is not a letter or digit.
All of the matches are placed in a Match Collection, and we only need the last item in it. There will be MC.Count matches and, since the collection's index is zero-based, we subtract one to get the last match. Here is the code:
===========================================
Option Explicit
Function LastOne(sSearch As String, ParamArray WordList() As Variant) As String
    Dim RE As Object, MC As Object
    Dim sPat As String
    Dim RNG, C

    For Each RNG In WordList
        If IsArray(RNG) Or IsObject(RNG) Then
            For Each C In RNG
                sPat = sPat & "|" & C
            Next C
        Else
            sPat = sPat & "|" & RNG
        End If
    Next RNG
    sPat = "\b(?:" & Mid(sPat, 2) & ")\b"

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        .ignorecase = True
        If .test(sSearch) = True Then
            Set MC = .Execute(sSearch)
            LastOne = MC(MC.Count - 1)
        End If
    End With
End Function
===========================================
Here is a sample screenshot:
Note that an absence of a WordList word will result in a blank cell. One could produce an error if that is preferable.
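If an error is preferable, one possible sketch is shown here. The function name LastOneOrNA is hypothetical, and returning #N/A is an assumption (any other worksheet error could be used). The only changes from LastOne are the Variant return type, so that an error value can be returned, and the Else branch.

```vba
Option Explicit

' Hypothetical variant of LastOne: returns #N/A instead of a blank
' when none of the words is found.
Function LastOneOrNA(sSearch As String, ParamArray WordList() As Variant) As Variant
    Dim RE As Object, MC As Object
    Dim sPat As String
    Dim RNG As Variant, C As Variant

    ' Build the pipe-separated list of words, as in LastOne
    For Each RNG In WordList
        If IsArray(RNG) Or IsObject(RNG) Then
            For Each C In RNG
                sPat = sPat & "|" & C
            Next C
        Else
            sPat = sPat & "|" & RNG
        End If
    Next RNG
    sPat = "\b(?:" & Mid(sPat, 2) & ")\b"

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        .IgnoreCase = True
        If .Test(sSearch) Then
            Set MC = .Execute(sSearch)
            LastOneOrNA = MC(MC.Count - 1).Value   ' last match
        Else
            LastOneOrNA = CVErr(xlErrNA)           ' shows as #N/A in the cell
        End If
    End With
End Function
```

It would be entered the same way, for example =LastOneOrNA(A8,"Charlie","Delta").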
Assuming your string is in cell A1, and there are no uses of the tilde (~) character in it, you can use the following in a worksheet:
=IF(IFERROR(FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))),0)>IFERROR(FIND("~",SUBSTITUTE(A1,"d","~",LEN(A1)-LEN(SUBSTITUTE(A1,"d","")))),0),"c","d")
EDIT:
In response to a comment, here's an explanation of how this works. I've also neatened up the formula slightly, having looked back at it again. The two formulae for c and d are identical, so the explanation applies to both. So, working outwards:
LEN(A1)-LEN(SUBSTITUTE(A1,"c",""))
Here we remove all instances of c from the string. By comparing the length of this calculated string with that of the original string, we calculate the number of times c appears in the original string.
SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))
Now that we know the number of times c appears in our string, we replace the last occurrence of c with the tilde character (here we assume the tilde isn't otherwise used in the string).
FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c",""))))
We then find the position of the tilde in the string, which is equivalent to the position of the last c in the string.
IFERROR(FIND("~",SUBSTITUTE(A1,"c","~",LEN(A1)-LEN(SUBSTITUTE(A1,"c","")))),0)
Wrapping this in an IFERROR ensures that we don't have errors coming through the formula. Setting the value to 0 if no c exists ensures that we still get a correct answer if our string contains c but not d (and vice versa).
We then apply the same calculation to d and compare the two to see which occurs later in our string. Note: this will give an incorrect answer (d) if there is neither c nor d in the string.
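In VBA you can do this using the following simple logic.
Sub ShowLastCOrD()
    Dim str As String
    str = "a>b>b>d>c>a"
    Dim Cet() As String
    Cet = Split(str, ">")
    Dim i As Long
    ' Walk the tokens from last to first; Step -1 is needed to count down
    For i = UBound(Cet) To LBound(Cet) Step -1
        ' Each comparison must be written out in full; LCase$ ignores case
        If LCase$(Cet(i)) = "c" Or LCase$(Cet(i)) = "d" Then
            MsgBox Cet(i)
            Exit For
        End If
    Next i
End Sub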