
Using VBA to parse text in an MS Word document

I was hoping someone could help with an MS Word macro.

Basically, I have an MS Word document that lists several text files and the specific pages of interest in each file.

The file format is similar to:

textdocument1.txt              P. 6, 12 - issue1
textdocument2.txt              P. 5 - issue1
                               P. 13, 17 - issue3
textdocument3.txt              P. 10

I want to read each line into my macro as a string.

Then I would traverse it to identify the file name. With the file name, I can open the file, go to the page number, and copy the data I need.

But I'm stuck at step 1: how do I capture a line as a string in an MS Word macro?

Any help will be appreciated.

The following code should get you started:

Public Sub ParseLines()
    Dim singleLine As Paragraph
    Dim lineText As String

    For Each singleLine In ActiveDocument.Paragraphs
        lineText = singleLine.Range.Text

        '// parse the text here...

    Next singleLine
End Sub
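
As a rough sketch of the parsing step, the helpers below pull the file name out of one line of text. The names CleanLine and ExtractFileName are hypothetical, not part of Word's object model. The sketch assumes that file names contain no spaces and that a continuation line (one without a file name) starts with "P.", as in the sample from the question. A paragraph's Range.Text ends with the paragraph mark, so that character is stripped first.

```vba
Option Explicit

' Hypothetical helper: removes the paragraph mark (vbCr), turns tabs
' into spaces, and trims leading and trailing spaces.
Public Function CleanLine(ByVal rawText As String) As String
    Dim s As String
    s = Replace(rawText, vbCr, "")
    s = Replace(s, vbTab, " ")
    CleanLine = Trim$(s)
End Function

' Hypothetical helper: returns the first space-delimited token of the
' line, or "" when the line is empty or starts with a page reference.
' Assumption: file names contain no spaces.
Public Function ExtractFileName(ByVal lineText As String) As String
    Dim s As String
    Dim pos As Long

    s = CleanLine(lineText)
    If Len(s) = 0 Or Left$(s, 2) = "P." Then Exit Function

    pos = InStr(s, " ")
    If pos = 0 Then
        ExtractFileName = s              ' the whole line is the name
    Else
        ExtractFileName = Left$(s, pos - 1)
    End If
End Function
```

Inside the loop above, ExtractFileName(lineText) could be called at the "parse the text here" comment. An empty result would mean the line continues the previous file's page list.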

I found the basic algorithm in this article.

If your Word document lists all the text files like this:

<name>{tab}<page ref>{newline}
<name>{tab}<page ref>{newline}
<name>{tab}<page ref>{newline}

Then all the lines are available in the Paragraphs collection. You can loop through that with a simple For Each loop:

Dim p As Paragraph

For Each p In ActiveDocument.Paragraphs
  Debug.Print p.Range.Text
Next p
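
If the lines really do follow the name, tab, page reference layout, each paragraph could be split on the tab character. This is only a sketch under that assumption. The Sub name PrintNameAndPages is made up for illustration, and lines without a tab are just reported rather than parsed.

```vba
Option Explicit

' Sketch: split each paragraph into file name and page reference.
' Assumption: the two fields are separated by a single tab.
Public Sub PrintNameAndPages()
    Dim p As Paragraph
    Dim lineText As String
    Dim parts() As String

    For Each p In ActiveDocument.Paragraphs
        ' Range.Text includes the trailing paragraph mark; drop it
        lineText = Replace(p.Range.Text, vbCr, "")

        If Len(lineText) > 0 Then
            parts = Split(lineText, vbTab)
            If UBound(parts) >= 1 Then
                Debug.Print "File: " & parts(0) & " | Pages: " & parts(1)
            Else
                Debug.Print "No tab found: " & lineText
            End If
        End If
    Next p
End Sub
```

The output goes to the Immediate window of the VBA editor, as with Debug.Print in the loop above.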

Per line (this loops over the sentences within each paragraph):

Public Sub ParseDoc()

    Dim doc As Document
    Set doc = ActiveDocument
    Dim paras As Paragraphs
    Set paras = doc.Paragraphs
    Dim para As Paragraph
    Dim sents As Sentences
    Dim sent As Range
    For Each para In paras

        Set sents = para.Range.Sentences
        For Each sent In sents
            Debug.Print sent.Text
        Next

    Next

End Sub

If the text contains special characters or is in another language, the above code may not work. This is the solution I came up with for one task I worked on. It splits each paragraph on Chr(11), the manual line break character:

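Public Sub ParseManualLineBreaks()
    Dim para As Paragraph
    Dim sentence() As String
    Dim i As Long

    For Each para In ActiveDocument.Paragraphs
        ' Chr(11) is the manual line break (Shift+Enter) inside a paragraph
        sentence = Split(para.Range.Text, Chr(11))
        For i = 0 To UBound(sentence)
            MsgBox sentence(i)
        Next i
    Next para
End Sub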
