
How do I extract only 5-digit strings from cells in Excel?

I have a lot of data in which each cell contains any number of 5-digit strings in completely inconsistent formats, and I want to extract these 5-digit strings. I am not interested in strings with fewer or more than 5 digits. As an example, this is the kind of data I have in my file:

Cell A1: "1. 76589 - wholesale activities. 2. 33476 - general"

Cell A2: "WHOLESALE ACTIVITIES ( 76589 ). SHIPPING ( 12235 ). REAL ESTATE ACTIVITIES ( 67333 )"

Cell A3: "1. 33476 General. 658709 annual road. Unknown 563"

I've tried the usual SEARCH/FIND, MIN, and LEFT/RIGHT/MID functions, but I am not sure how to get them to produce the result I need, and even Text to Columns wasn't giving me a clean result.

Thanks in advance.

Here is a macro that will extract each 5-digit string into its own column.

The range being processed is whatever you have selected. The results are written into the adjacent columns on the same row.

Depending on your worksheet setup, you may want to "clear out" the rows where the results are going before executing the extraction code.

You can also write code to select the data to be processed automatically. Plenty of examples on this forum.


Option Explicit
Sub Extract5Digits()
    Dim R As Range, C As Range
    Dim RE As Object, MC As Object, M As Object
    Dim I As Long

    Set R = Selection
    Set RE = CreateObject("vbscript.regexp")    ' late binding, no library reference needed
    With RE
        .Global = True                  ' return every match, not only the first
        .Pattern = "\b\d{5}\b"          ' exactly 5 digits between word boundaries
        For Each C In R
            If .test(C.Text) = True Then
                I = 0
                Set MC = .Execute(C.Text)
                For Each M In MC
                    I = I + 1
                    C.Offset(0, I) = M  ' each match goes one column further right
                Next M
            End If
        Next C
    End With
End Sub
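
The two remarks above (clearing old results first, and picking the range in code rather than relying on the selection) could look roughly like the sketch below. It assumes the data sits in column A of the active sheet starting at row 1, and that everything to the right of column A in those rows may be overwritten. The name Extract5DigitsColumnA is made up for this example.

```vba
' Hypothetical variant of Extract5Digits: processes column A of the active
' sheet instead of the current selection, and clears old results first.
' Assumption: columns B onward in the processed rows may be overwritten.
Sub Extract5DigitsColumnA()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long, i As Long
    Dim RE As Object, MC As Object, M As Object

    Set ws = ActiveSheet
    ' Last row in column A that contains anything
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    Set RE = CreateObject("vbscript.regexp")
    RE.Global = True
    RE.Pattern = "\b\d{5}\b"

    For r = 1 To lastRow
        ' Remove results left over from an earlier run on this row
        ws.Range(ws.Cells(r, 2), ws.Cells(r, ws.Columns.Count)).ClearContents
        Set MC = RE.Execute(ws.Cells(r, 1).Text)
        i = 0
        For Each M In MC
            i = i + 1
            ws.Cells(r, 1 + i).Value = M.Value
        Next M
    Next r
End Sub
```

Note that Excel stores the results as numbers, so a code with a leading zero would lose it unless the target cells are formatted as text beforehand.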


With Excel functions alone this is impossible.

The best way is to use the VBScript Regular Expressions 5.5 library in VBA.

Let's consider this example:

+---+--------------------------------------------------------------+
|   |                              A                               |
+---+--------------------------------------------------------------+
| 1 | Cell A3: "1. 33476 General. 658709 annual road. Unknown 563" |
| 2 | 33476                                                        |
+---+--------------------------------------------------------------+

In the Excel file, press Alt + F11, then go to Tools > References and select "Microsoft VBScript Regular Expressions 5.5".

Then you can use the following function definition:

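Public Function Get5DigitsNumer(search_str As String) As String
    Dim regEx As New VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection

    Get5DigitsNumer = ""
    ' Word boundaries keep longer runs such as 658709 from matching
    regEx.Pattern = "\b[0-9]{5}\b"
    regEx.Global = True
    If regEx.Test(search_str) Then
        Set matches = regEx.Execute(search_str)
        ' The pattern has no capture group, so read .Value (first match only)
        Get5DigitsNumer = matches(0).Value
    End If
End Function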

You can then call it with the following code:

Sub PatternExtractor()
    Range("A2").Value = Get5DigitsNumer(Range("A1"))
End Sub

This takes the value of cell A1, extracts the 5-digit number, and saves the result in cell A2.

At the moment I don't have any idea how to make this code work when the same cell contains more than one match, as in Cell A1: "1. 76589 - wholesale activities. 2. 33476 - general" from your example.

I suggest you have a look at this answer. The pattern is different, but the question is very similar to yours.

The only way to do it is by writing a regex in VBA. I would recommend looking at this question.
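
For cells with more than one match, one possible approach is to loop over the whole match collection and join the results. This is only a sketch: the function name GetAll5Digits and the delimiter parameter are made up for this example, and it uses late binding so it works without setting the library reference.

```vba
' Hypothetical helper: returns every standalone 5-digit run in search_str,
' joined by delimiter (", " by default). Returns "" when there is none.
Public Function GetAll5Digits(search_str As String, _
                              Optional delimiter As String = ", ") As String
    Dim regEx As Object, matches As Object, m As Object
    Dim result As String

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True              ' find all matches, not just the first
    regEx.Pattern = "\b\d{5}\b"      ' exactly 5 digits between word boundaries

    Set matches = regEx.Execute(search_str)
    For Each m In matches
        If Len(result) > 0 Then result = result & delimiter
        result = result & m.Value
    Next m

    GetAll5Digits = result
End Function
```

It can be called from a worksheet cell as =GetAll5Digits(A1). Because it returns text, leading zeros are preserved.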

I have seen the code for extracting 5 digits, and it's amazing. I have modified it to extract 8 digits, but in my data I need to extract numbers of different lengths:

I need to extract one number with 8 digits and another one with 9 digits.

Example: WG: 54627489 KD NR. 0285542106 Matthias Dannhauer.

Thank you in advance.
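
For numbers of varying length, the same regex approach might be generalised by using a quantifier range such as \d{8,10} in place of \d{5}. The sketch below takes the minimum and maximum length as parameters, because the second number in the example above actually has 10 digits. The function name GetDigitRuns and its parameters are made up for this example.

```vba
' Hypothetical helper: returns all standalone digit runs whose length is
' between minLen and maxLen (inclusive), joined by ", ".
Public Function GetDigitRuns(search_str As String, _
                             minLen As Long, maxLen As Long) As String
    Dim regEx As Object, matches As Object, m As Object
    Dim result As String

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    ' Builds e.g. "\b\d{8,10}\b" for minLen = 8 and maxLen = 10
    regEx.Pattern = "\b\d{" & minLen & "," & maxLen & "}\b"

    Set matches = regEx.Execute(search_str)
    For Each m In matches
        If Len(result) > 0 Then result = result & ", "
        result = result & m.Value
    Next m

    GetDigitRuns = result
End Function
```

For example, =GetDigitRuns(A1, 8, 10) would pick up both numbers from the sample line, while =GetDigitRuns(A1, 8, 8) would return only the 8-digit one.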
