Parse Word document in VBScript
A friend gave me an odd task: parse a bunch of Word files and write certain parts of them to a text file for further processing.
VBScript is not my cup of tea, so I'm not sure how to fit the pieces together.
The documents look like this:
Header
A lot of not interesting text
Table
Header
More boring text
Table
I want to parse the documents and get all the headers and the table contents out of them. I'm stepping through the document with
For Each wPara In wd.ActiveDocument.Paragraphs
And I think I know how to get the headers:
If Left(wPara.Range.Style, Len("Heading")) = "Heading" Then
But I'm unsure how to do the
Else if .. this paragraph belongs to a table..
So any hint on how I could determine whether a paragraph is part of a table would be nice.
Untested, because I have no access to MS Word right now.
Option Explicit

Dim FSO, Word, textfile, doc, para

' start Word instance, open doc ...
' start FileSystemObject instance, open textfile for output...

For Each para In doc.Paragraphs
    If IsHeading(para) Or IsInTable(para) Then
        ' statement-style call: no parentheses around multiple arguments
        SaveToFile textfile, para
    End If
Next

Function IsHeading(para)
    ' outline levels 1-9 are headings; 10 is body text
    IsHeading = para.OutlineLevel < 10
End Function

Function IsInTable(para)
    Dim p, dummy

    IsInTable = False
    Set p = para.Parent

    ' at some point p and p.Parent will both be the Word Application object
    Do While Not (p Is p.Parent)
        ' dirty check: if p is a table, calling a table object method will work
        On Error Resume Next
        Set dummy = p.Cell(1, 1)
        If Err.Number = 0 Then
            IsInTable = True
            Exit Do
        Else
            Err.Clear
        End If
        On Error GoTo 0

        Set p = p.Parent
    Loop
End Function
Obviously SaveToFile is something you'd implement yourself.
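One way such a helper might look, as a minimal sketch rather than part of the original answer: it assumes textfile is an already-open TextStream and that one line of plain text per paragraph is what you want. CleanParagraphText is a hypothetical name introduced here for illustration.

```vbscript
' Hypothetical helper (not from the original answer): strips the marks Word
' appends to paragraph text so each paragraph becomes one clean line.
Function CleanParagraphText(text)
    Dim result
    result = Replace(text, Chr(7), "")   ' end-of-cell mark inside tables
    result = Replace(result, vbCr, "")   ' paragraph mark
    CleanParagraphText = Trim(result)
End Function

' Assumes textfile is an open TextStream, for example one returned by
' FSO.CreateTextFile(path, True, True)   ' overwrite = True, Unicode = True
Sub SaveToFile(textfile, para)
    textfile.WriteLine CleanParagraphText(para.Range.Text)
End Sub
```

Opening the file as Unicode is a precaution: WriteLine can fail on characters outside the system code page when the stream was created as ANSI.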
Since "is in table" is naturally defined as "the object's parent is a table", this is a perfect situation to use recursion (deconstructed a little further):
Function IsInTable(para)
    IsInTable = IsTable(para.Parent)
    ' keep climbing until a table is found or the top of the hierarchy is reached
    If Not (IsInTable Or para Is para.Parent) Then
        IsInTable = IsInTable(para.Parent)
    End If
End Function

Function IsTable(obj)
    Dim dummy
    ' dirty check: only a table object supports Cell(row, column)
    On Error Resume Next
    Set dummy = obj.Cell(1, 1)
    IsTable = (Err.Number = 0)
    Err.Clear
    On Error GoTo 0
End Function
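If probing parent objects feels too fragile, Word's object model also exposes Range.Information, which can report directly whether a range lies inside a table. The following is a sketch of that alternative, not something taken from the answers above, and like them it is untested against a live Word instance. It assumes the value 12 for wdWithInTable (from the WdInformation enumeration), declared by hand because a standalone script cannot see Word's named constants.

```vbscript
' wdWithInTable from Word's WdInformation enumeration; declared here because
' VBScript has no access to Word's named constants.
Const wdWithInTable = 12

Function IsInTable(para)
    ' Range.Information returns True when the range lies inside a table
    IsInTable = CBool(para.Range.Information(wdWithInTable))
End Function
```

This version would replace both IsInTable variants above; the main loop stays the same.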