简体   繁体   English

使用VBA清除具有多个定界符的数据

[英]Clean data having multiple delimiters using vba

I have a list of manually entered ticket numbers with different delimiters and some strings. 我有一个带有不同定界符和一些字符串的手动输入的票号列表。 I am trying to clean it to just have the ticket numbers starting with "AK1" in a single string separated with commas. 我正在尝试对其进行清洁,以使其票号仅以“ AK1”开头,并以逗号分隔。

在此处输入图片说明

数据

Multi Split 多分割

Dirty Version 脏版

In this version all the strings are split and written using the principle one cell to one (another) cell. 在此版本中,所有字符串都被分割并使用一个单元原则写入一个(另一个)单元。

Sub MultiSplit()

    Const cDel As String = ";,/"  ' Delimiter List
    Const cCol1 As Variant = "A"  ' Source Column Letter/Number
    Const cCol2 As Variant = "B"  ' Target Column Letter/Number
    Const cDelR As String = ","   ' Replace Delimiter
    Const cFirstR As Long = 1     ' First Row Number

    Dim vntR As Variant   ' Range Array
    Dim vntD As Variant   ' Delimiter Array

    Dim LastR As Long     ' Last Row Number
    Dim i As Long         ' Range Array Row Counter
    Dim j As Long         ' Delimiter Array Row Counter

    ' Calculate Last Row Number.
    LastR = Cells(Rows.Count, cCol1).End(xlUp).Row

    ' Copy Source Range into Range Array.
    vntR = Range(Cells(cFirstR, cCol1), Cells(LastR, cCol1))

    ' Split Delimiter List into Delimiter Array
    vntD = Split(cDel, ",")

    ' Calculate values in Range Array.
    For i = 1 To UBound(vntR) ' Range Array
        For j = 0 To UBound(vntD) ' Delimiter Array
            ' Replace by overwriting.
            vntR(i, 1) = Replace(vntR(i, 1), vntD(j), cDelR)
        Next
    Next

    ' Copy Range Array to Target Range.
    Range(Cells(cFirstR, cCol2), Cells(LastR, cCol2)) = vntR

End Sub

Clean One String Version 清理一个字符串版本

If you want all the AK1 tickets in a single cell then use the following code. 如果要将所有AK1票证都放在一个单元格中,请使用以下代码。 Adjust cDelC (the final delimiter) to fit your needs ( eg aa,aa or aa, aa). 调整cDelC(最终定界符)以适合您的需求(例如,aa,aa或aa,aa)。

Sub MultiSplit2()

    Const cDel As String = ";,/"     ' Delimiter List
    Const cCol1 As Variant = "A"     ' Source Column Letter/Number
    Const cCol2 As Variant = "B"     ' Target Column Letter/Number
    Const cDelR As String = ","      ' Replace Delimiter
    Const cFirstR As Long = 1        ' First Row Number
    Const cDelC As String = ", "     ' Clean Delimiter
    Const cString As String = "AK1"  ' Desired Start String

    Dim vntR As Variant   ' Range Array
    Dim vntD As Variant   ' Delimiter Array
    Dim vntT As Variant   ' Temporary Array

    Dim LastR As Long     ' Last Row Number
    Dim i As Long         ' Range Array Row Counter
    Dim j As Long         ' Delimiter Array Row Counter
    Dim strT As String    ' Target String


    ' Calculate Last Row Number.
    LastR = Cells(Rows.Count, cCol1).End(xlUp).Row

    ' Copy Source Range into Range Array.
    vntR = Range(Cells(cFirstR, cCol1), Cells(LastR, cCol1))

    ' Split Delimiter List into Delimiter Array
    vntD = Split(cDel, ",")

    ' Calculate values in Range Array.
    For i = 1 To UBound(vntR) ' Range Array
        For j = 0 To UBound(vntD) ' Delimiter Array
            ' Replace by overwriting.
            vntR(i, 1) = Replace(vntR(i, 1), vntD(j), cDelR)
        Next
        Debug.Print vntR(i, 1)
    Next

    ' Clean the strings in Range Array.
    For i = 1 To UBound(vntR)
        vntT = Split(vntR(i, 1), cDelR)
        For j = 0 To UBound(vntT)
            If Left(Trim(vntT(j)), Len(cString)) = cString Then
                If strT <> "" Then
                    strT = strT & cDelC & Trim(vntT(j))
                  Else
                    strT = Trim(vntT(j))
                End If
            End If
        Next
    Next

    ' Copy Target String to Target Cell.
    Cells(cFirstR, cCol2) = strT

End Sub

I suggest to do this with a UDF (User Defined Function. Install the code below in a standard code module (Press Alt+F11 to open the VB Editor window. Right-click on the VBA Project in the Project explorer window on the left, Select Insert > Module, and paste the code in the empty code panel on the right). Remember to save the workbook in xlsm (macro enabled) format. 我建议使用UDF(用户定义的函数)执行此操作。将以下代码安装在标准代码模块中(按Alt + F11打开VB编辑器窗口。右键单击左侧“项目资源管理器”窗口中的VBA项目,选择“插入”>“模块”,然后将代码粘贴到右侧的空白代码面板中。请记住以xlsm(启用宏)格式保存工作簿。

Function ExtractAK1(Cell As Range) As String

    Const AK1 As String = "AK1-"

    Dim Var As Variant
    Dim Sp() As String
    Dim i As Integer

    Var = Cell.Value
    If VarType(Var) = vbString Then
        If InStr(1, Var, AK1, vbTextCompare) Then
            Sp = Split(Trim(Var), AK1)
            For i = 1 To UBound(Sp)
                Sp(i) = AK1 & Left(Trim(Sp(i)), 5)
            Next i
            Var = Join(Sp, ",")
            ExtractAK1 = Mid(Var, InStr(Var, ",") + 1)
        End If
    End If
End Function

Call the function in the worksheet like you would call a built-in Excel function, for example, 像调用内置Excel函数一样,在工作表中调用该函数,例如,

=ExtractAK1($A2)

If properly installed, Excel will suggest the function's name when you start typing it. 如果安装正确,当您开始键入该函数时,Excel会建议该函数的名称。 $A2 is the cell containing your text. $ A2是包含您的文本的单元格。 Copy the formula down for as long as needed. 根据需要向下复制公式。 You can re-purpose the function for use in a loop if that is the more convenient way of using it. 如果这是更方便的使用方法,则可以将其重新用于循环中。

The following UDF will extract whatever you input into a comma delimited list of AK ticket numbers only. 以下UDF会将您输入的任何内容提取到仅以逗号分隔的AK票证编号列表中。 It is assumed that the ticket number pattern is AK- followed by digits only, which is what you show. 假设您显示的票证号码格式为AK-后接数字。 And only the ticket numbers are extracted, also what you say you want. 而且仅提取票证编号,也提取您想要的内容。

  • If you input a single string, or a single cell, those contents will come out. 如果输入单个字符串或单个单元格,则将显示这些内容。
  • If you input a range of cells, they will be combined into a single output string. 如果输入一定范围的单元格,它们将被合并为一个输出字符串。
  • VBA's Regular Expression engine is used to extract the ticket numbers VBA的正则表达式引擎用于提取票证编号

Option Explicit
  Public RE As Object
  Public MC As Object
  Public M As Object

    'Assume starts with AK- and ends with numbers
    '  as per your example
 Public Const sPat As String = "\bAK1-\d+"

Function getAK(vIN As Variant) As String
    Dim V As Variant
    Dim sTemp As String

Set RE = CreateObject("vbscript.regexp")
With RE
    .Pattern = sPat
    .ignorecase = False
    .Global = True
End With

If IsArray(vIN) Then
    For Each V In vIN
        sTemp = sTemp & "," & getStrOnly(CStr(V))
    Next V
Else
    getAK = getStrOnly(CStr(vIN))
    Exit Function
End If

getAK = Mid(sTemp, 2)

End Function

Private Function getStrOnly(str As String) As String
    Dim sTemp As String
    With RE
        If .test(str) = True Then
            Set MC = .Execute(str)
            For Each M In MC
                sTemp = sTemp & "," & M
            Next M
        End If
    End With
    getStrOnly = Mid(sTemp, 2)
End Function

Using the single cell version: getAK(A1) : 使用单单元格版本: getAK(A1)

在此处输入图片说明

Using the multiple cell method: 使用多单元格方法:

=getAK(A1:A12)

we get 我们得到

AK1-97760,AK1-96767,AK1-97719,AK1-97999,AK1-98105,,AK1-97113,AK1-97073,AK1-97019,AK1-97951,AK1-97858,AK1-97195,AK1-96806,AK1-97719,AK1-97896,AK1-98115,AK1-98151,AK1-98089,AK1-96780,AK1-90919,AK1-96705,AK1-96806,AK1-95397

If you also want to return the status of the ticket (the part in parentheses after the ticket number), you can change the regex to: 如果您还想返回票证的状态(票证编号后面括号中的部分),则可以将正则表达式更改为:

"\bAK1-\d+(?:\s*\([^)]+\))?"

And if your ticket patterns are different, you can also alter the regex accordingly. 并且,如果您的票证模式不同,则还可以相应地更改正则表达式。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM