Problem: Looking for a more efficient way of finding whether there is an exact matching value in a 1d array -- essentially a boolean true/false
.
Am I overlooking something obvious? Or am I simply using the wrong data structure, by using an array when I probably should be using a collection object or a dictionary? In the latter I could check the .Contains
or .Exists
method, respectively
In Excel I can check for a value in a vector array like:
If Not IsError(Application.Match(strSearch, varToSearch, False)) Then
' Do stuff
End If
This returns an exact match index, obviously subject to limitations of Match
function which only finds the first matching value in this context. This is a commonly used method, and one that I have been using for a long time, too.
This is satisfactory enough for Excel -- but what about other applications?
In other applications, I can do basically the same thing but requires enabling reference to the Excel object library, and then:
If Not IsError(Excel.Application.match(...))
That seems silly, though, and is difficult to manage on distributed files because of permissions/trust center/etc.
I have tried to use the Filter() function:
If Not Ubound(Filter(varToSearch, strSearch)) = -1 Then
'do stuff
End If
But the problem with this approach is that Filter
returns an array of partial matches, rather than an array of exact matches. (I have no idea why it would be useful to return substring/partial matches.)
The other alternative is to literally iterate over each value in the array (this also is very commonly used I think) -- which seems even more needlessly cumbersome than calling on Excel's Match
function.
For each v in vArray
If v = strSearch Then
' do stuff
End If
Next
If we're going to talk about performance then there's no substutute for running some tests. In my experience Application.Match() is up to ten times slower than calling a function which uses a loop.
Sub Tester()
Dim i As Long, b, t
Dim arr(1 To 100) As String
For i = 1 To 100
arr(i) = "Value_" & i
Next i
t = Timer
For i = 1 To 100000
b = Contains(arr, "Value_50")
Next i
Debug.Print "Contains", Timer - t
t = Timer
For i = 1 To 100000
b = Application.Match(arr, "Value_50", False)
Next i
Debug.Print "Match", Timer - t
End Sub
Function Contains(arr, v) As Boolean
Dim rv As Boolean, lb As Long, ub As Long, i As Long
lb = LBound(arr)
ub = UBound(arr)
For i = lb To ub
If arr(i) = v Then
rv = True
Exit For
End If
Next i
Contains = rv
End Function
Output:
Contains 0.8710938
Match 4.210938
I used to look for a best replace solution. It should work for simple finding as well.
To find first instance of a string you can try using this code:
Sub find_strings_1()
Dim ArrayCh() As Variant
Dim rng As Range
Dim i As Integer
ArrayCh = Array("a", "b", "c")
With ActiveSheet.Cells
For i = LBound(ArrayCh) To UBound(ArrayCh)
Set rng = .Find(What:=ArrayCh(i), _
LookAt:=xlPart, _
SearchOrder:=xlByColumns, _
MatchCase:=False)
Debug.Print rng.Address
Next i
End With
End Sub
If you want to find all instances try the below.
Sub find_strings_2()
Dim ArrayCh() As Variant
Dim c As Range
Dim firstAddress As String
Dim i As Integer
ArrayCh = Array("a", "b", "c") 'strings to lookup
With ActiveSheet.Cells
For i = LBound(ArrayCh) To UBound(ArrayCh)
Set c = .Find(What:=ArrayCh(i), LookAt:=xlPart, LookIn:=xlValues)
If Not c Is Nothing Then
firstAddress = c.Address 'used later to verify if looping over the same address
Do
'_____
'your code, where you do something with "c"
'which is a range variable,
'so you can for example get it's address:
Debug.Print ArrayCh(i) & " " & c.Address 'example
'_____
Set c = .FindNext(c)
Loop While Not c Is Nothing And c.Address <> firstAddress
End If
Next i
End With
End Sub
Keep in mind that if there are several instances of searched string within one cell it will return only one result due to the specific of FindNext.
Still, if you need a code for replacing found values with another, I'd use the first solution, but you'd have to change it a bit.
"A more efficient way (compared to Application.Match
)of finding whether a string value exists in an array":
I believe there is no more efficient way than the one you are using, ie, Application.Match
.
Arrays allow efficient access in any element if we know the index of that element. If we want to do anything by element value (even checking if an element exists), we have to scan all the elements of the array in the worst case. Therefore, the worst case needs n
element comparisons, where n
is the size of the array. So the maximum time we need to find if an element exists is linear in the size of the input, ie, O(n)
. This applies to any language that uses conventional arrays.
The only case where we can be more efficient, is when the array has special structure. For your example, if the elements of the array are sorted (eg alphabetically), then we do not need to scan all the array: we compare with the middle element, and then compare with the left or right part of the array ( binary search) . But without assuming any special structure, there is no hope..
The Dictionary/Collection
as you point, offers constant key access to their elements ( O(1)
). What perhaps is not very well documented is that one can also have index access to the dictionary elements (Keys and Items): the order in which elements are entered into the Dictionary
is preserved. Their main disadvantage is that they use more memory as two objects are stored for each element.
To wrap up, although If Not IsError(Excel.Application.match(...))
looks silly, it is still the more efficient way (at least in theory). On permission issues my knowledge is very limited. Depending on the host application, there are always some Find
-type functions ( C++
has find
and find_if
for example).
I hope that helps!
Edit
I would like to add a couple of thoughts, after reading the amended version of the post and Tim's answer. The above text is focusing on the theoretical time complexity of the various data structures and ignores implementation issues. I think the spirit of the question was rather, "given a certain data structure (array)", what is the most efficient way in practice of checking existence.
To this end, Tim's answer is an eye-opener.
The conventional rule "if VBA
can do it for you then don't write it again yourself" is not always true. Simple operations like looping and comparisons can be faster that "agreegate" VBA
functions. Two interesting links are here and here .
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.