
How to create an array from calculations on another array in Excel VBA

I am new to coding and am trying to learn through VBA. I am trying to calculate outliers in a data set by following a procedure. My trouble is identifying the elements in the data set that are furthest from the mean (the outliers) and repeating that k times. Most of the code is messy because I have been trying to find out what is wrong, so please ignore the MsgBox calls and the ugly formatting. In the last part of my code I tried to subtract the mean from each element of DataSet and store the results in a new array (Diff). After that I would take the absolute value of each element of Diff and store it in another array (Diff2). I know I could bypass Diff2 by taking the absolute value directly when calculating Diff. When I run the code I get a type mismatch error, and after some investigation I realized that Diff (and Diff2) are not arrays. If anyone knows how I can make Diff an array, or knows a better workaround, that would be much appreciated!

Sub CalculateOutliers()
    Dim n As Integer
    Dim mean As Double
    Dim SD As Double
    Dim X As Integer
    Dim k As Integer
    Dim DataSet As Variant
    Dim ESDPrin As Double

    DataSet = Selection.Value
    'Copies highlighted data into DataSet variable
    'Cell A1 is (1,1) Because it starts at 0 which is out of range

    n = Selection.CountLarge
    'Counts number of entries
    'If n < 20 Then
        'MsgBox "Data set too small"
        'Exit Sub
    'End If
    'Ends Subroutine if data set is too small for this analysis

    If n < 50 Then
        k = Int(n / 10)
    Else
        k = 5
    End If
    'determines k = number of possible outliers

    mean = Application.WorksheetFunction.Average(DataSet)
    'Calculates mean of Data Set
    MsgBox mean & "Average"

    SD = Application.WorksheetFunction.StDev(DataSet)
    'Calculates Standard Deviation of Data Set

    Dim element As Variant
    Dim Diff As Variant

    For Each element In DataSet
        Diff = element - mean
        MsgBox Diff & " Difference"
    Next element

    Dim P As Integer
    Dim Outlier As Integer
    Dim Diff2 As Variant

    Diff2 = Abs(Diff)

    For P = 1 To k
        Outlier = UBound(Diff, 1)
        MsgBox Outlier
    Next P
End Sub

Here is how to create the Diff array with size n:

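' Size Diff once, using the element count n computed earlier
ReDim Diff(1 To n) As Double
Dim i As Long
For Each element In DataSet
    i = i + 1                   ' position of the current element in Diff
    Diff(i) = element - mean    ' signed deviation from the mean
Next element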

However, I don't think this is the right way to go, because there is no need for a Diff array. Once you have calculated the mean and SD, iterate over the DataSet array itself: for each element, take its absolute difference from the mean, divide by the standard deviation, and compare this ratio to some threshold (say 2 or 3) to decide whether the element is an outlier, in which case you print it. Something like this:

For Each element In DataSet
    If Abs(element - mean) / SD > 3 Then Debug.Print "outlier: " & element
Next element
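For completeness, here is a minimal, self-contained sketch of the same threshold check as a full macro. The Sub name PrintOutliersByThreshold and the constant threshold are hypothetical, not part of the original code. It assumes the selection is a single contiguous worksheet range containing at least two plain numbers, and it adds two guards the snippet above does not have: one for selections that are too small, and one for SD = 0.

```vba
Option Explicit

' Hypothetical helper (not from the original post): prints every plain
' numeric value in the selection that lies more than `threshold`
' standard deviations from the mean.
Sub PrintOutliersByThreshold()
    Const threshold As Double = 3   ' assumed cutoff; the answer suggests 2 or 3
    Dim DataSet As Variant
    Dim element As Variant
    Dim mean As Double
    Dim SD As Double

    ' A single cell would make Selection.Value a scalar rather than an array,
    ' and StDev needs at least two values.
    If Selection.Cells.CountLarge < 2 Then
        MsgBox "Select at least two cells"
        Exit Sub
    End If

    DataSet = Selection.Value
    mean = Application.WorksheetFunction.Average(DataSet)
    SD = Application.WorksheetFunction.StDev(DataSet)

    ' Identical values give SD = 0, which would cause a division by zero.
    If SD = 0 Then
        Debug.Print "All values are identical; no outliers"
        Exit Sub
    End If

    For Each element In DataSet
        ' Only test cells that hold plain numbers (stored as Double).
        If VarType(element) = vbDouble Then
            If Abs(element - mean) / SD > threshold Then
                Debug.Print "outlier: " & element
            End If
        End If
    Next element
End Sub
```

To try it, select a column of numbers, run the macro, and check the Immediate window (Ctrl+G in the VBA editor) for any printed values.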

I think the code would look like this:

Sub CalculateOutliers()
    Dim n As Integer
    Dim mean As Double
    Dim SD As Double
    Dim X As Integer
    Dim k As Integer
    Dim DataSet As Variant
    Dim ESDPrin As Double

    DataSet = Selection.Value
    'Copies highlighted data into DataSet variable
    'Cell A1 is (1,1) Because it starts at 0 which is out of range

    n = Selection.CountLarge
    'Counts number of entries
    'If n < 20 Then
        'MsgBox "Data set too small"
        'Exit Sub
    'End If
    'Ends Subroutine if data set is too small for this analysis

    If n < 50 Then
        k = Int(n / 10)
    Else
        k = 5
    End If
    'determines k = number of possible outliers

    mean = Application.WorksheetFunction.Average(DataSet)
    'Calculates mean of Data Set
    MsgBox mean & " Average"

    SD = Application.WorksheetFunction.StDev(DataSet)
    'Calculates Standard Deviation of Data Set

    Dim element As Variant
    Dim Diff() As Variant, Diff2() As Variant, j As Integer

    For Each element In DataSet
        j = j + 1
        ReDim Preserve Diff(1 To j): ReDim Preserve Diff2(1 To j)
        Diff(j) = element - mean
        Diff2(j) = Abs(Diff(j))
        MsgBox Diff(j) & " Difference"
        MsgBox Diff2(j) & " Difference abs "
    Next element
    MsgBox UBound(Diff)
    'Dim P As Integer
    'Dim Outlier As Integer
    'Dim Diff2 As Variant

End Sub
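Once the absolute differences are in an array, the step the question was stuck on (picking the element furthest from the mean, k times) can be sketched as below. This is only an illustration under stated assumptions: the names IndexOfLargest, ListFurthestFromMean, original and absDiff are hypothetical, the selection is assumed to be a single contiguous range of at least two cells, and the mean is computed once. If the procedure being followed requires recomputing the mean and standard deviation after each candidate is removed, that part would still need to be added.

```vba
Option Explicit

' Hypothetical helper: returns the index of the largest non-negative entry
' in a 1-based array of Doubles, or 0 if every entry is marked as used
' (negative).
Private Function IndexOfLargest(ByRef values() As Double) As Long
    Dim i As Long
    Dim best As Long
    best = 0
    For i = LBound(values) To UBound(values)
        If values(i) >= 0 Then
            If best = 0 Then
                best = i
            ElseIf values(i) > values(best) Then
                best = i
            End If
        End If
    Next i
    IndexOfLargest = best
End Function

' Hypothetical macro: prints the k values furthest from the mean.
Sub ListFurthestFromMean()
    Dim DataSet As Variant, element As Variant
    Dim original() As Double, absDiff() As Double
    Dim n As Long, k As Long, i As Long, P As Long, idx As Long
    Dim mean As Double

    If Selection.Cells.CountLarge < 2 Then Exit Sub
    DataSet = Selection.Value                      ' 2-D, 1-based Variant array
    n = UBound(DataSet, 1) * UBound(DataSet, 2)    ' total number of cells
    mean = Application.WorksheetFunction.Average(DataSet)

    ' Size both arrays once instead of growing them inside the loop.
    ReDim original(1 To n)
    ReDim absDiff(1 To n)
    For Each element In DataSet
        i = i + 1
        If VarType(element) = vbDouble Then
            original(i) = element
            absDiff(i) = Abs(element - mean)
        Else
            absDiff(i) = -1    ' mark non-numeric cells so they are never picked
        End If
    Next element

    ' Same rule for k as in the question.
    If n < 50 Then k = Int(n / 10) Else k = 5

    For P = 1 To k
        idx = IndexOfLargest(absDiff)
        If idx = 0 Then Exit For                   ' nothing left to pick
        Debug.Print "candidate " & P & ": " & original(idx)
        absDiff(idx) = -1                          ' exclude it from the next pass
    Next P
End Sub
```

Marking a picked entry with -1 works because an absolute difference is never negative, so no second "used" array is needed.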
