Copy text from Word to Excel based on a list of search words
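Dear forum members,
For a research project at my university, I have to transfer text passages from Word documents into an Excel file based on keywords.
There is a list of keywords (all listed one below the other in an Excel column) and several Word documents (about 80-100, with 400 pages each).
The program should search the Word documents for the keywords and, if a word is found, copy that word plus the 350 characters before and after it into an Excel row. In addition, the document name and the page number should be copied. Each word found should go into a new row.
Based on some initial research on Google, I ended up with the following code. Much of what I need already works with it.
I need your help with the following two points:
1.) How can the copied text be extended? If the search word is found in the Word document, the word plus 350 characters before and after it should be copied.
2.) What should a loop look like so that all Word documents in a folder are processed one after the other?
Since I have tried for a long time without finding a solution, I would be grateful for any hint or solution.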
Sub LocateSearchItem_Test22()
    Dim shtSearchItem As Worksheet
    Dim shtExtract As Worksheet
    Dim oWord As Word.Application
    Dim WordNotOpen As Boolean
    Dim oDoc As Word.Document
    Dim oRange As Word.Range
    Dim LastRow As Long
    Dim CurrRowShtSearchItem As Long
    Dim CurrRowShtExtract As Long
    Dim myPara As Long
    Dim myLine As Long
    Dim myPage As Long
    Dim oDocName As Variant
    On Error Resume Next
    Application.ScreenUpdating = False
    Set oWord = GetObject(, "Word.Application")
    If Err Then
        Set oWord = New Word.Application
        WordNotOpen = True
    End If
    On Error GoTo Err_Handler
    oWord.Visible = True
    oWord.Activate
    Set oDoc = oWord.Documents.Open("C:\Users\Lenovo\Downloads\Data fronm Word to Excel\Testdatei.docx")
    'Use the document we just opened; an unqualified ActiveDocument is not tied to oWord when run from Excel
    oDocName = oDoc.Name
    Set shtSearchItem = ThisWorkbook.Worksheets(1)
    If ThisWorkbook.Worksheets.Count < 2 Then
        ThisWorkbook.Worksheets.Add After:=shtSearchItem
    End If
    Set shtExtract = ThisWorkbook.Worksheets(2)
    LastRow = shtSearchItem.UsedRange.Rows(shtSearchItem.UsedRange.Rows.Count).Row
    For CurrRowShtSearchItem = 2 To LastRow
        Set oRange = oDoc.Range
        With oRange.Find
            .Text = shtSearchItem.Cells(CurrRowShtSearchItem, 1).Text
            .MatchCase = False
            '.MatchWholeWord = False
            .MatchWildcards = True
            While oRange.Find.Execute = True
                oRange.Select
                myPara = oDoc.Range(0, oWord.Selection.Paragraphs(1).Range.End).Paragraphs.Count
                myPage = oWord.Selection.Information(wdActiveEndAdjustedPageNumber)
                myLine = oWord.Selection.Information(wdFirstCharacterLineNumber)
                CurrRowShtExtract = CurrRowShtExtract + 1
                shtExtract.Cells(CurrRowShtExtract, 1).Value = .Text
                shtExtract.Cells(CurrRowShtExtract, 2).Value = myPara
                shtExtract.Cells(CurrRowShtExtract, 3).Value = myPage
                shtExtract.Cells(CurrRowShtExtract, 4).Value = myLine
                shtExtract.Cells(CurrRowShtExtract, 5).Value = oDocName
                shtExtract.Cells(CurrRowShtExtract, 6) = oDoc.Paragraphs(myPara).Range
                oRange.Collapse wdCollapseEnd
            Wend
        End With
    Next CurrRowShtSearchItem
    If WordNotOpen Then
        oWord.Quit
    End If
    'Release object references
    Set oWord = Nothing
    Set oDoc = Nothing
    Exit Sub
Err_Handler:
    MsgBox "Word caused a problem. " & Err.Description, vbCritical, "Error: " & Err.Number
    If WordNotOpen Then
        oWord.Quit
    End If
End Sub
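I'm going to focus on the Word part specifically, since that is my specialty. It looks like you have a decent grasp of VBA, so I'll just answer with snippets.
Here is your Find: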
With oRange.Find
    .Text = shtSearchItem.Cells(CurrRowShtSearchItem, 1).Text
    .MatchCase = False
    '.MatchWholeWord = False
    .MatchWildcards = True 'do you really want wildcards?
    .Wrap = wdFindStop
    While .Execute = True
        myPara = oDoc.Range(0, oRange.End).Paragraphs.Count
        myPage = oRange.Information(wdActiveEndAdjustedPageNumber)
        myLine = oRange.Information(wdFirstCharacterLineNumber)
        'Expand range size begins here
        oRange.MoveStart wdCharacter, -350 'not sure if you want the info of just the word or the word +/- 350 characters
        oRange.MoveEnd wdCharacter, 350
        CurrRowShtExtract = CurrRowShtExtract + 1
        shtExtract.Cells(CurrRowShtExtract, 1).Value = .Text
        shtExtract.Cells(CurrRowShtExtract, 2).Value = myPara
        shtExtract.Cells(CurrRowShtExtract, 3).Value = myPage
        shtExtract.Cells(CurrRowShtExtract, 4).Value = myLine
        shtExtract.Cells(CurrRowShtExtract, 5).Value = oDocName
        shtExtract.Cells(CurrRowShtExtract, 6) = oRange.Text
        oRange.Collapse wdCollapseEnd
    Wend
End With
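If you can help it, never select anything. Almost everything in Word can be done without ever using Selection. Declare a range and manipulate the range. There is no need to select it.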
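One thing to watch in the snippet above: it expands oRange itself, so the following Collapse wdCollapseEnd lands 350 characters past the hit, and any further hit inside that stretch would be skipped by the next Execute. A possible way around this, sketched here under the assumption that you want every hit reported, is to expand a copy of the range instead. ContextAround and oContext are names made up for this illustration; Range.Duplicate, MoveStart and MoveEnd are the regular Word object model members.

```vba
'Hypothetical helper (not part of the original code): returns the matched
'text plus up to nChars characters on each side. It works on a duplicate,
'so the range that Find is iterating over is left exactly where it was.
Function ContextAround(ByVal oMatch As Word.Range, ByVal nChars As Long) As String
    Dim oContext As Word.Range
    Set oContext = oMatch.Duplicate          'independent copy of the hit
    oContext.MoveStart wdCharacter, -nChars  'stops at the start of the story if there is less text
    oContext.MoveEnd wdCharacter, nChars     'stops at the end of the story if there is less text
    ContextAround = oContext.Text
End Function
```

Inside the While loop you would then drop the two Move lines and write shtExtract.Cells(CurrRowShtExtract, 6).Value = ContextAround(oRange, 350), leaving oRange.Collapse wdCollapseEnd as it is so the next search resumes right after the hit.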
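As for looping through every document in a folder, look into the FileSystemObject. The documentation is terrible, but the Google results are usually pretty good.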
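For what it's worth, a folder loop along those lines could look roughly like the sketch below. It is only an outline under a few assumptions: FOLDER_PATH is a placeholder, only .docx files are wanted, the sub name ProcessAllWordDocs is made up, and the project already references the Word object library as in the question. The FileSystemObject is created late bound, so no extra reference to the Scripting Runtime is needed.

```vba
Sub ProcessAllWordDocs()
    'Assumption: placeholder path, replace with the folder that holds your documents
    Const FOLDER_PATH As String = "C:\Path\To\WordDocs"

    Dim fso As Object            'Scripting.FileSystemObject, late bound
    Dim oFile As Object          'Scripting.File
    Dim oWord As Word.Application
    Dim oDoc As Word.Document

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set oWord = New Word.Application

    For Each oFile In fso.GetFolder(FOLDER_PATH).Files
        'Take only .docx files and skip the "~$" lock files Word creates for open documents
        If LCase$(fso.GetExtensionName(oFile.Name)) = "docx" _
           And Left$(oFile.Name, 2) <> "~$" Then
            Set oDoc = oWord.Documents.Open(FileName:=oFile.Path, ReadOnly:=True)
            'Run the keyword search against oDoc here and
            'write oDoc.Name into the document-name column
            oDoc.Close SaveChanges:=False
        End If
    Next oFile

    oWord.Quit
    Set oDoc = Nothing
    Set oWord = Nothing
End Sub
```

Keeping a single Word instance open for the whole loop avoids starting Word 80-100 times, and opening each file read-only means the source documents cannot be changed by accident.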