简体   繁体   中英

Extract ONLY THE FIRST match from Word doc using RegEx lunched from Excel VBA

There is a document like this one. I process 20 documents like this every day and they all look the same (structure, I mean, is very consistent).

在此处输入图片说明

The goal of this macro is to extract ONLY THE FIRST match of the RegEx pattern from the .ActiveDocument.Content . In the whole doc there is many more matches, but I need only the first one. The document being processed will be manually opened before the macro would run.

I'm just a VBA beginner so if there is a possibility to write it without using arrays, collections or some dictionaries I'd much appreciate. There is just one item to extract, so it's best to load it inside repNmbr string variable and from there just ws.Range("G30").Value = repNmbr . The simpler the better.

I used these resources Excel Regex Tutorial (Regular Expressions) which is very helpful but I still don't know how to load the FIRST MATCH alone into my repNmbr string variable. I'd like to do this without using any loop, because I just want to load a single string into this repNmbr variable.

Currently I have code like this:

Sub ExtractRepertor03()
    'Application.ScreenUpdating = False
    Dim WordApp As Word.Application
    Dim WordDoc As Word.Document
    Dim ExcelApp As Excel.Application
    Dim rng As Word.Range
    Dim ws As Worksheet
    Dim regEx As Object
    Dim matches As MatchCollection
    Dim match As String
    Dim repNmbr As String

    'Assigning object variables
    Set WordApp = GetObject(, "Word.Application")      'ActiveX can't create object is when
    Set ExcelApp = GetObject(, "Excel.Application")    'there is no Word document open;
    Set regEx = CreateObject("VBScript.RegExp")
    Set WordDoc = WordApp.ActiveDocument
    Set rng = WordApp.ActiveDocument.Content

    'Create the regular expression object
    regEx.Global = False    'because I need only the first match instead of all occurences;
    regEx.IgnoreCase = True
    regEx.Pattern = "([0-9]{1,5})([ ]{0,4})([/])([0-9]{4})"
    'regEx.Pattern = "([0-9]{1,5})([\s]{0,4})(/[0-9]{4})"

    repNmbr = regEx.Execute(rng.text)      'here is something wrong but I don't know what;
                            'I'm trying to assign the first RegEx match to repNmbr variable;
    Debug.Print repNmbr
    repNmbr = Replace(repNmbr, " ", "") 
'    Set matches = regEx.Execute(rng.text)
'    Debug.Print regEx.Test(rng)
'    'Debug.Print regEx.Value
'        For Each match In matches    'I just want this macro run without the loop
'           Debug.Print match.Value   'Result: 9042 /2019
'           repNmbr = match.Value
'        Next match

    ExcelApp.Application.Visible = True
    ws.Range("G30").Value = repNmbr
End Sub

And an error like this:

在此处输入图片说明

Can someone explain to me why Set matches = regEx.Execute(rng.text) works fine but repNmbr = regEx.Execute(rng.text) returns the error: "Wrong number of arguments or invalid property assignment"??

After regEx.Global = False is set, the RegEx finds only a single value, so why VBA refuses to assign this string into the repNmbr string variable??

As I said in your other question, you don't need the RegEx library for this. Stick to Word's wildcards ! Try:

Sub Demo()
Application.ScreenUpdating = False
Dim WordApp As Word.Application
Set WordApp = GetObject(, "Word.Application")
With WordApp.ActiveDocument.Range
  With .Find
    .Text = "<[0-9 ]{1,7}/[0-9]{4}>"
    .MatchWildcards = True
    .Wrap = wdFindStop
    .Forward = True
    .Execute
  End With
  If .Find.Found = True Then ActiveSheet.Range("G30").Value = Replace(.Text, " ", "")
End With
Application.ScreenUpdating = True
End Sub

Note : I haven't bothered with any of:

Dim ExcelApp As Excel.Application
Dim rng As Word.Range
Dim ws As Worksheet
Dim regEx As Object
Dim matches As MatchCollection
Dim match As String
Dim repNmbr As String

as it's all superfluous - even your own code never assigns anything to ws.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM