VBA/Excel - Count unique words in columns with multiple words in each cell

I am working with the dataset below. For each company, I would like to find out how many different products it orders.

For example, company 'AAA' orders 6 different products (Products 1, 2, 3, 4, 5 and 7).

[Image: example]

I am not sure whether we need to split the words in each cell and then count them one by one in a loop, or whether there is a faster method. I have to use VBA here, and my dataset is more than 100k.

Assuming the data is in A1:C?, you could maybe piece something together using:

Sub Test()

Dim arr As Variant, el As Variant
Dim lr As Long, x As Long, y As Long
Dim dict1 As Object: Set dict1 = CreateObject("Scripting.Dictionary")
Dim dict2 As Object: Set dict2 = CreateObject("Scripting.Dictionary")

'Get initial array (NOTE: implicit reference to the active worksheet)
lr = Cells(Rows.Count, "A").End(xlUp).Row
arr = Range("A2:C" & lr)

'Loop through array and fill dictionary
For x = LBound(arr) To UBound(arr)
    dict1(arr(x, 1)) = dict1(arr(x, 1)) & "," & arr(x, 3)
Next

'Loop through dictionary and count unique items
For y = 0 To dict1.Count - 1
    For Each el In Split(dict1.Items()(y), ",")
        dict2(Trim(el)) = 1 'Trim guards against spaces after the commas
    Next
    'Subtract 1 for the empty string produced by the leading comma
    dict1(dict1.keys()(y)) = dict2.Count - 1
    dict2.RemoveAll
    
    'Check the result
    Debug.Print dict1.keys()(y) & "-" & dict1.Items()(y)
Next

End Sub

This answer might seem very silly, but since you separate the different products with commas, why not simply count the number of commas and add 1, something like:

=LEN(C2)-LEN(SUBSTITUTE(C2,",",""))+1

Once you have this in a helper column, you can use Excel's basic Subtotals feature to find the sum per customer.
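Since the question asks for VBA, the same per-cell count could be sketched as a small user-defined function. The name `CountItems` and its parameter are hypothetical, not from the answer above. Note that this only counts the items listed in one cell, so summing it per customer counts a product again each time it is repeated in another row; it does not by itself give the number of distinct products.

```vba
' Hypothetical helper (name is an assumption): number of comma-separated
' items in a text value. Counts repeats; it does not de-duplicate.
Function CountItems(ByVal cellText As String) As Long
    If Len(Trim$(cellText)) = 0 Then
        CountItems = 0                                   ' empty cell: nothing ordered
    Else
        CountItems = UBound(Split(cellText, ",")) + 1    ' commas + 1
    End If
End Function
```

Placed in a standard module, it could be used from a helper column as `=CountItems(C2)`.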

Please test the following code. It returns each unique client, followed by the total product count and, in the subsequent columns, the ordered products. In the code below the result is written to the next sheet, but it can be written to any sheet:

Sub ProductsPerClient()
    Dim sh As Worksheet, sh1 As Worksheet, lastR As Long, arr, arrSpl, arrFin, colMax As Long
    Dim i As Long, j As Long, dict As Object
    
    Set sh = ActiveSheet
    Set sh1 = sh.Next 'use here the sheet you need (sh.Next is Nothing if the active sheet is the last one)
    lastR = sh.Range("A" & sh.rows.count).End(xlUp).row
    arr = sh.Range("A2:C" & lastR).value
    Set dict = CreateObject("Scripting.Dictionary")
    For i = 1 To UBound(arr)
        arrSpl = Split(Trim(arr(i, 3)), ",")
        If Not dict.Exists(arr(i, 1)) Then
            dict.Add arr(i, 1), Join(arrSpl, "|")
            If UBound(arrSpl) + 1 > colMax Then colMax = UBound(arrSpl) + 1
        Else
            dict(arr(i, 1)) = dict(arr(i, 1)) & "|" & Join(arrSpl, "|")
            If UBound(Split(dict(arr(i, 1)), "|")) + 1 > colMax Then colMax = UBound(Split(dict(arr(i, 1)), "|")) + 1
        End If
    Next i
    ReDim arrFin(1 To dict.count, 1 To colMax + 2)

    For i = 0 To dict.count - 1
        arrFin(i + 1, 1) = dict.Keys()(i)
        arrSpl = Split(dict.items()(i), "|")
        arrFin(i + 1, 2) = UBound(arrSpl) + 1
        For j = 0 To UBound(arrSpl)
            arrFin(i + 1, j + 3) = arrSpl(j)
        Next j
    Next i
    'drop the final array content:
    sh1.Range("A2").Resize(UBound(arrFin), UBound(arrFin, 2)).value = arrFin
End Sub
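The code above counts every product listed for a client, including products that appear again in a later row. If the goal is the number of different products per company, as the question asks, one possible variation is to keep a nested dictionary per company so that repeats collapse automatically. This is only a sketch under the same assumptions as the answers (company in column A, comma-separated products in column C, data starting in row 2 of the active sheet); the procedure and variable names are made up for illustration.

```vba
' Sketch: distinct products per company using nested dictionaries.
' Assumes column A = company, column C = comma-separated products, data from row 2.
Sub DistinctProductsPerCompany()
    Dim data As Variant, part As Variant, company As Variant
    Dim productName As String
    Dim lastRow As Long, i As Long
    Dim companies As Object, products As Object

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub                  ' no data rows
    data = Range("A2:C" & lastRow).Value          ' read the sheet once into memory

    Set companies = CreateObject("Scripting.Dictionary")
    For i = 1 To UBound(data, 1)
        ' One inner dictionary per company holds its distinct products
        If Not companies.Exists(data(i, 1)) Then
            companies.Add data(i, 1), CreateObject("Scripting.Dictionary")
        End If
        Set products = companies(data(i, 1))
        For Each part In Split(CStr(data(i, 3)), ",")
            productName = Trim$(part)
            ' Assigning to an existing key overwrites it, so repeats are not recounted
            If Len(productName) > 0 Then products(productName) = True
        Next part
    Next i

    ' Print "company - distinct product count" to the Immediate window
    For Each company In companies.Keys
        Debug.Print company & " - " & companies(company).Count
    Next company
End Sub
```

Working on an in-memory array rather than reading cells one by one is what keeps this reasonable for a dataset of the size mentioned in the question. The dictionaries compare keys case-sensitively by default, so "Product 1" and "product 1" would count as two products.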
