
Copy text from Word to Excel based on a list of search words

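Dear forum members,

For a research project at my university I have to transfer text passages from Word documents into an Excel file, based on keywords.

There is a list of keywords (listed one below the other in an Excel column) and several Word documents (about 80-100, each around 400 pages long).

The program should search the Word documents for the keywords and, whenever a word is found, copy that word plus the 350 characters before and after it into an Excel row. The document name and the page number should be copied as well. Each hit should go into a new row.

After some initial research on Google I arrived at the following code. Most of what I need already works with it.

I need your help with the following two points:

1.) How can I extend the text that is copied? When a search word is found in the Word document, the word plus the 350 characters before and after it should be copied.

2.) What should a loop look like so that all Word documents in a folder are processed one after the other?

I have been trying for a long time without finding a solution, so I would be grateful for any hint or solution.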

Sub LocateSearchItem_Test22()
Dim shtSearchItem As Worksheet
Dim shtExtract As Worksheet
Dim oWord As Word.Application
Dim WordNotOpen As Boolean
Dim oDoc As Word.Document
Dim oRange As Word.Range
Dim LastRow As Long                 
Dim CurrRowShtSearchItem As Long    
Dim CurrRowShtExtract As Long      
Dim myPara As Long
Dim myLine As Long
Dim myPage As Long
Dim oDocName As Variant

    On Error Resume Next

    Application.ScreenUpdating = False

    Set oWord = GetObject(, "Word.Application")

    If Err Then
        Set oWord = New Word.Application
        WordNotOpen = True
    End If

    On Error GoTo Err_Handler

    oWord.Visible = True
    oWord.Activate
    Set oDoc = oWord.Documents.Open("C:\Users\Lenovo\Downloads\Data fronm Word to Excel\Testdatei.docx")       

    oDocName = oDoc.Name

    Set shtSearchItem = ThisWorkbook.Worksheets(1)
    If ThisWorkbook.Worksheets.Count < 2 Then
        ThisWorkbook.Worksheets.Add After:=shtSearchItem
    End If
    Set shtExtract = ThisWorkbook.Worksheets(2)

    LastRow = shtSearchItem.UsedRange.Rows(shtSearchItem.UsedRange.Rows.Count).Row

    For CurrRowShtSearchItem = 2 To LastRow
        Set oRange = oDoc.Range
        With oRange.Find
            .Text = shtSearchItem.Cells(CurrRowShtSearchItem, 1).Text
            .MatchCase = False
            '.MatchWholeWord = False
            .MatchWildcards = True
            While oRange.Find.Execute = True
                oRange.Select
                myPara = oDoc.Range(0, oWord.Selection.Paragraphs(1).Range.End).Paragraphs.Count
                myPage = oWord.Selection.Information(wdActiveEndAdjustedPageNumber)
                myLine = oWord.Selection.Information(wdFirstCharacterLineNumber)

                CurrRowShtExtract = CurrRowShtExtract + 1

                    shtExtract.Cells(CurrRowShtExtract, 1).Value = .Text
                    shtExtract.Cells(CurrRowShtExtract, 2).Value = myPara
                    shtExtract.Cells(CurrRowShtExtract, 3).Value = myPage
                    shtExtract.Cells(CurrRowShtExtract, 4).Value = myLine
                    shtExtract.Cells(CurrRowShtExtract, 5).Value = oDocName
                    shtExtract.Cells(CurrRowShtExtract, 6) = oDoc.Paragraphs(myPara).Range

                oRange.Collapse wdCollapseEnd

            Wend
        End With
    Next CurrRowShtSearchItem

    If WordNotOpen Then
        oWord.Quit
    End If

    'Release object references

    Set oWord = Nothing
    Set oDoc = Nothing

    Exit Sub

Err_Handler:
    MsgBox "Word caused a problem. " & Err.Description, vbCritical, "Error: " & Err.Number
    If WordNotOpen Then
        oWord.Quit
    End If

End Sub

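I will focus on the Word part, since that is my specialty. It looks like you have a reasonable grasp of VBA, so I will answer with snippets only.

Here is your Find: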

With oRange.Find
    .Text = shtSearchItem.Cells(CurrRowShtSearchItem, 1).Text
    .MatchCase = False
    '.MatchWholeWord = False
    .MatchWildcards = True 'do you really want wildcards?
    .Wrap = wdFindStop
    While .Execute = True
        myPara = oDoc.Range(0, oRange.End).Paragraphs.Count
        myPage = oRange.Information(wdActiveEndAdjustedPageNumber)
        myLine = oRange.Information(wdFirstCharacterLineNumber)
        'Expand range size begins here
        oRange.MoveStart wdCharacter, -350 'not sure if you want the info of just the word or the word +/- 350 characters
        oRange.MoveEnd wdCharacter, 350

        CurrRowShtExtract = CurrRowShtExtract + 1

        shtExtract.Cells(CurrRowShtExtract, 1).Value = .Text
        shtExtract.Cells(CurrRowShtExtract, 2).Value = myPara
        shtExtract.Cells(CurrRowShtExtract, 3).Value = myPage
        shtExtract.Cells(CurrRowShtExtract, 4).Value = myLine
        shtExtract.Cells(CurrRowShtExtract, 5).Value = oDocName
        shtExtract.Cells(CurrRowShtExtract, 6) = oRange.Text

        oRange.Collapse wdCollapseEnd
    Wend
End With
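
One thing to watch in the snippet above: because oRange itself is widened by 350 characters before it is collapsed, the next search resumes 350 characters past the hit, so a keyword that occurs again inside that window would be skipped. A possible way around this, sketched here with a hypothetical helper named ContextText (not part of the original code), is to widen a copy of the range and leave oRange on the hit itself. It assumes the same reference to the Microsoft Word Object Library that the code above already relies on.

```vba
' Hypothetical helper: returns the text of oFound plus up to nChars
' on either side, without moving oFound itself.
Function ContextText(oFound As Word.Range, ByVal nChars As Long) As String
    Dim oContext As Word.Range

    ' Duplicate gives an independent range covering the same span
    Set oContext = oFound.Duplicate

    ' Widen only the copy; Word stops at the start/end of the document
    oContext.MoveStart wdCharacter, -nChars
    oContext.MoveEnd wdCharacter, nChars

    ContextText = oContext.Text
End Function
```

With this in place, the two MoveStart/MoveEnd lines on oRange could be dropped and column 6 filled with shtExtract.Cells(CurrRowShtExtract, 6).Value = ContextText(oRange, 350), so that the Collapse at the end of the loop continues right after the hit.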

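If you can help it, never select anything. Almost everything in Word can be done without ever using Selection. Declare a Range and manipulate the Range. There is no need to select it.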

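As for looping through every document in a folder, take a look at FileSystemObject. The documentation is poor, but the Google results are usually pretty good.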
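
A minimal sketch of such a folder loop might look like the following. The folder path and the names ProcessAllWordDocs and ProcessDocument are placeholders made up for illustration; ProcessDocument stands in for the Find loop shown earlier. The Word types again assume a reference to the Microsoft Word Object Library, while FileSystemObject is created late-bound so no extra reference is needed for it.

```vba
' Sketch: open every .docx file in one folder, one after the other.
Sub ProcessAllWordDocs()
    Const FOLDER_PATH As String = "C:\Path\To\WordDocs"   ' assumed folder
    Dim fso As Object
    Dim oFile As Object
    Dim oWord As Word.Application
    Dim oDoc As Word.Document

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set oWord = New Word.Application

    For Each oFile In fso.GetFolder(FOLDER_PATH).Files
        ' Skip non-Word files and Word's temporary lock files (~$name.docx)
        If LCase$(fso.GetExtensionName(oFile.Name)) = "docx" _
           And Left$(oFile.Name, 2) <> "~$" Then
            Set oDoc = oWord.Documents.Open(FileName:=oFile.Path, ReadOnly:=True)
            ProcessDocument oDoc
            oDoc.Close SaveChanges:=wdDoNotSaveChanges
        End If
    Next oFile

    oWord.Quit
    Set oDoc = Nothing
    Set oWord = Nothing
End Sub

' Placeholder: the keyword search for a single document would go here.
Sub ProcessDocument(oDoc As Word.Document)
    Debug.Print "Processing " & oDoc.Name
End Sub
```

Opening the documents read-only and closing them without saving keeps the source files untouched, which seems sensible when 80-100 large documents are only being searched.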

