TL;DR summary: I want a formula that will find the Nth "_" (for any N) in a string and return its index, OR find the Nth substring separated by "_". I have VBA to do this, but it's slow.
Long version: I am working with advertising campaign data. My marketers (fortunately) use a consistent naming scheme for their campaigns. Unfortunately, it's very long.
The campaign names contain exactly 1 piece of data that I cannot otherwise get from reports.
For reference, campaign names are of the format:
ADV_CO_BG_Product_UniqueID_XX_mm.dd.yyyy_mm.dd.yyyy_TYP_NUM
... and I have a column of about 200K of them (growing by a couple hundred each week).
Edit:
The important part is that the campaign name has multiple parts, with _ as the delimiter between them. In this case, I want the 9th part, but I want an option that is flexible enough that I don't have to add or remove lines to change which part I target.
Other questions suggest using a nested formula like:
=MID(
Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign],
FIND("_",Data_OLV[@Campaign])+1)
+1)
+1)
+1)
+1)
+1)
+1)
+1,
3)
... but that is hard to modify if I need something in a different position.
I have a UDF called StringSplit (see below) that provides the desired results, but it's extremely slow (and only works if you enable macros, which not all of my audience does).
Is there a better way to do what I'm trying to do?
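Public Function StringSplit(input_ As String, delimiter_ As String, index_ As Integer) As Variant
    ' Returns the index_-th (1-based) piece of input_, split on delimiter_.
    Dim parts() As String
    On Error GoTo ErrHandler
    parts = Split(input_, delimiter_, -1, vbTextCompare)
    StringSplit = parts(index_ - 1)
    Exit Function
ErrHandler:
    ' Error 9 (subscript out of range) means there are fewer than index_ pieces.
    If Err.Number = 9 Then
        StringSplit = CVErr(xlErrRef)
    Else
        StringSplit = Err.Description
    End If
End Function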
I think this is the formula you are looking for -
=MID(A2, FIND(CHAR(1), SUBSTITUTE(A2, B2, CHAR(1), C2))+1, FIND(CHAR(1), SUBSTITUTE(A2, B2, CHAR(1), C2+1)) - FIND(CHAR(1), SUBSTITUTE(A2, B2, CHAR(1), C2))-1)
This is how to do it -
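Here B2 is the delimiter and C2 is N, the occurrence of the delimiter to start after. The formula returns the text between the Nth and the (N+1)th delimiter, so the 9th part of your campaign name needs C2 = 8. It assumes a single-character delimiter. You can adapt the formula as needed; just change B2 and C2.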
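To check the logic outside Excel, here is a minimal Python sketch of what the formula does. The names `MARKER`, `substitute_nth` and `part_after_nth` are made up for illustration; `substitute_nth` only imitates SUBSTITUTE with an instance number, and a single-character delimiter that never contains the marker is assumed.

```python
MARKER = "\x01"  # plays the role of CHAR(1); assumed absent from the text


def substitute_nth(text: str, old: str, new: str, n: int) -> str:
    """Replace only the n-th occurrence of old, imitating SUBSTITUTE(text, old, new, n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    start = 0
    pos = -1
    for _ in range(n):
        pos = text.find(old, start)
        if pos == -1:
            return text  # fewer than n occurrences: text comes back unchanged
        start = pos + len(old)
    return text[:pos] + new + text[pos + len(old):]


def part_after_nth(text: str, delimiter: str, n: int) -> str:
    """Text between the n-th and (n+1)-th delimiter (single-character delimiter assumed)."""
    # .index raises ValueError when the marker is missing, much like FIND giving #VALUE!
    first = substitute_nth(text, delimiter, MARKER, n).index(MARKER)
    second = substitute_nth(text, delimiter, MARKER, n + 1).index(MARKER)
    return text[first + 1:second]


name = "ADV_CO_BG_Product_UniqueID_XX_mm.dd.yyyy_mm.dd.yyyy_TYP_NUM"
print(part_after_nth(name, "_", 8))
```

Like the formula, the sketch fails for the last part of the string, because there is no (N+1)th delimiter to find.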
If, for example, you want to locate the third instance of ? in cell A1 , try:
=FIND(CHAR(1),SUBSTITUTE(A1,"?",CHAR(1),3))
NOTE: We assume that CHAR(1) does not appear in the original string.
To get the last instance, use:
=FIND(CHAR(1),SUBSTITUTE(A1,"?",CHAR(1),(LEN(A1)-LEN(SUBSTITUTE(A1,"?","")))))
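For anyone who wants to sanity-check the two formulas, here is a rough Python equivalent. The function names `nth_position` and `last_position` are hypothetical, positions are 1-based to match FIND, and a single-character search string with n of at least 1 is assumed.

```python
from typing import Optional


def nth_position(text: str, char: str, n: int) -> Optional[int]:
    """1-based position of the n-th occurrence of char, or None if there are fewer than n."""
    pos = -1
    for _ in range(n):
        pos = text.find(char, pos + 1)
        if pos == -1:
            return None
    return pos + 1


def last_position(text: str, char: str) -> Optional[int]:
    """1-based position of the last occurrence of char, or None if it never occurs."""
    # Count occurrences from the length difference, as in LEN(A1)-LEN(SUBSTITUTE(A1,"?",""))
    count = len(text) - len(text.replace(char, ""))
    return nth_position(text, char, count) if count else None


print(nth_position("a?b?c?d", "?", 3))
print(last_position("a?b?c?d", "?"))
```

The length-difference count only equals the number of occurrences when the search string is one character long; for longer strings it would have to be divided by the length of the search string.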
If I understand correctly, the data you receive is always in the format you posted, and you consistently want to extract the TYP data.
Why not search for TYP in the string, and additionally search for NUM, since that marks the start of the following subdata?
Then, you would end up with a formula such as
=TRIM(MID(W20,SEARCH("TYP",W20),SEARCH("NUM",W20)-SEARCH("TYP",W20)))
In this formula, cell W20 holds the entire data string. Naturally, you can change this reference or paste the whole string in its place.
EDIT
Since OP mentioned the title strings are not consistent:
=TRIM(MID(W20,SEARCH(A1,W20),IF(A2="",LEN(W20),SEARCH(A2,W20)-SEARCH(A1,W20))))
Cell A1 holds the title string of the data to extract (here, TYP). Cell A2 holds the title string of the next subdata. If A2 is empty, the formula returns everything from the match found by the first SEARCH function (the one using cell A1) to the end of the string.
As Egan Wolf commented, there is a solution at http://exceljet.net/formula/find-nth-occurrence-of-character which gives:
=MID([@[Campaign]],FIND(CHAR(160),SUBSTITUTE([@[Campaign]],"_",CHAR(160),9))+1,4)
Or, more generally:
=MID(TextToSearch,FIND(CHAR(160),SUBSTITUTE(TextToSearch,Delimiter,CHAR(160),InstanceNumber))+1,LengthOfDesiredSection)
LengthOfDesiredSection can, of course, be found with a subsection of the first formula, like so (line breaks added for clarity):
=MID(TextToSearch,
FIND(CHAR(160),SUBSTITUTE(TextToSearch,Delimiter,CHAR(160),InstanceNumber))+1,
IFERROR(
(FIND(CHAR(160),SUBSTITUTE(TextToSearch,Delimiter,CHAR(160),InstanceNumber+1))-
FIND(CHAR(160),SUBSTITUTE(TextToSearch,Delimiter,CHAR(160),InstanceNumber)))-1,
LEN(TextToSearch)-
FIND(CHAR(160),SUBSTITUTE(TextToSearch,Delimiter,CHAR(160),InstanceNumber))))
The IFERROR() protects against situations where the Delimiter only appears InstanceNumber times in the TextToSearch.
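As a cross-check of the IFERROR() fallback, here is a small Python sketch of the same idea. The function name `nth_section` is hypothetical, and a single-character delimiter is assumed; it is an illustration of the logic, not a replacement for the formula.

```python
def nth_section(text: str, delimiter: str, instance_number: int) -> str:
    """Section that follows the instance_number-th delimiter."""
    pos = -1
    for _ in range(instance_number):
        # ValueError here means too few delimiters, comparable to FIND returning #VALUE!
        pos = text.index(delimiter, pos + 1)
    start = pos + 1
    end = text.find(delimiter, start)  # next delimiter, or -1 if there is none
    if end == -1:
        # The IFERROR branch: no further delimiter, so run to the end of the text
        end = len(text)
    return text[start:end]


name = "ADV_CO_BG_Product_UniqueID_XX_mm.dd.yyyy_mm.dd.yyyy_TYP_NUM"
print(nth_section(name, "_", 8))
print(nth_section(name, "_", 9))
```

With the campaign format from the question, instance 8 gives the 9th part and instance 9 gives the final part through the fallback.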
One way to find the nth element of an underscore-delimited string, and return that substring, is with this formula:
=TRIM(MID(SUBSTITUTE(A1,"_",REPT(" ",999)),MAX(1,999*(n-1)),999))
where n is the instance you are looking for.
But, of course, this requires that the elements are present in the same order, and are always present (or replaced by an underscore if they are not).
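To see why the padding trick works, here is a minimal Python sketch of the same steps. The name `nth_element_padded` and its `width` parameter are made up for illustration; it assumes the whole string is much shorter than the padding width, as the formula does with 999.

```python
def nth_element_padded(text: str, n: int, delimiter: str = "_", width: int = 999) -> str:
    """Imitate TRIM(MID(SUBSTITUTE(text, "_", REPT(" ", 999)), MAX(1, 999*(n-1)), 999))."""
    # Every delimiter becomes a long run of spaces, pushing the elements far apart
    padded = text.replace(delimiter, " " * width)
    # 1-based start of a window that can only overlap the n-th element
    start = max(1, width * (n - 1))
    window = padded[start - 1:start - 1 + width]
    # strip() removes the surrounding padding; Excel's TRIM would also collapse inner runs of spaces
    return window.strip()


name = "ADV_CO_BG_Product_UniqueID_XX_mm.dd.yyyy_mm.dd.yyyy_TYP_NUM"
print(nth_element_padded(name, 9))
```

Because each gap is as wide as the window, a window can never contain text from two elements, and asking for an element past the end simply yields an empty string.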
If you are using a version of Excel with the FILTERXML function, you can use this formula:
=INDEX(FILTERXML("<t><s>" & SUBSTITUTE(A1,"_","</s><s>") & "</s></t>","//s"),n)
I'm not sure which of the two would be more efficient (faster) on a large data set.
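The FILTERXML formula works by turning the string into a small XML document and picking the nth node. A rough Python sketch of that idea follows; the function name `nth_element_xml` is hypothetical, and, like the formula, it does no escaping, so it assumes the text contains no XML special characters such as < or &.

```python
import xml.etree.ElementTree as ET


def nth_element_xml(text: str, n: int, delimiter: str = "_") -> str:
    """Wrap each element in <s>...</s> and return the n-th one (1-based)."""
    # Same construction as "<t><s>" & SUBSTITUTE(A1, "_", "</s><s>") & "</s></t>"
    xml = "<t><s>" + text.replace(delimiter, "</s><s>") + "</s></t>"
    nodes = ET.fromstring(xml).findall("s")  # selects the same nodes as "//s" in this document
    return nodes[n - 1].text or ""  # empty elements come back as ""


name = "ADV_CO_BG_Product_UniqueID_XX_mm.dd.yyyy_mm.dd.yyyy_TYP_NUM"
print(nth_element_xml(name, 9))
```

In the sketch, text with an unescaped & or < makes the parser raise an error, so this approach is a poor fit for names that can contain those characters.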