I have a large number of txt-files, which I want to search for specific words. My approach is using an excel macro to open the txt files in word and than search each occurence of a list of words, which I provide in the excel file. It gives me a list, of how often each word occurs in each document. I have managed to do so, using the following code:
Sub CounterofWords()
Application.ScreenUpdating = False
Dim wdApp As Word.Application
Set wdApp = CreateObject("Word.application")
wdApp.Visible = False
For d = 1 To 23
Dim wdDoc As Word.Document
FName = "C:\Users\Andreas\Desktop\test\" & Cells(d + 1, 11) & "_htm.txt"
On Error GoTo txtdesign
Set wdDoc = wdApp.Documents.Open(filename:=FName)
i = 15
Do While Cells(1, i) <> ""
iCount = 0
Application.ScreenUpdating = False
With wdApp.Selection.Find
.ClearFormatting
.Text = Cells(1, i).Value
Do While .Execute
iCount = iCount + 1
wdApp.Selection.MoveRight
Loop
End With
Cells(d + 1, i).Value = iCount
i = i + 1
Loop
wdDoc.Close savechanges:=False
Set wdDoc = Nothing
Next d
wdApp.Quit
Set wdApp = Nothing
Application.ScreenUpdating = True
Exit Sub
txtdesign:
FName = "C:\Users\Andreas\Desktop\test\" & Cells(d + 1, 11) & "_txt.txt"
Resume
End Sub
Everything works fine so far. Now I want to be able to search for regular expressions. I need this for example to avoid certain combinations of words in my search.
It seems to be a problem that I can not write something like
With wdApp.Selection.regex
Anyways, I don't know how to make regex work in a situation like this and appreciate your help!
The Find method in VBA has limited pattern-matching, using this flag:
Selection.Find.MatchWildcards = True
Note : your code would not get correct results as it is, because the search for each word starts where the previous one left off in the document. You need to "move" to the top of the document for each one:
Selection.HomeKey Unit:=wdStory
But if you need more complex pattern-matching using regular expressions, you'll need a different approach, using the RegExp
class, after referencing "Microsoft VBScript Regular Expressions 5.5". See a great explanation in the accepted answer to this SO question: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops .
Here's an example using regex:
Do While Cells(1, i) <> ""
Application.ScreenUpdating = False
Dim regEx As New RegExp
Dim Matches As MatchCollection
With regEx
.Global = True
.IgnoreCase = True
.Pattern = Cells(1, i).Value
End With
Set Matches = regEx.Execute(wdDoc.Content.Text)
Cells(d + 1, i).Value = Matches.Count
i = i + 1
Loop
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.