How to read MS Word paragraph and table content line by line
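I am using Microsoft.Office.Interop.Word in C# (3.5) to read a Word document. I read it line by line, split each line into an array, process every word in the line, replace some words based on business logic, and then replace the whole line with the converted line.
So far everything works fine.
Now I have some Word documents that contain both paragraphs and tables. I want to read each column of a table one by one and replace the content of a particular column.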
Update
Using Office automation, I am:
1. Opening the Word file.
2. Moving the cursor to the top of the document.
3. Selecting the first line using `wordApp.Selection.EndKey` and processing all of its words.
4. Replacing the selected line with the processed line.
5. Moving to the next line with `wordApp.Selection.MoveDown(ref lineCount, ref countPage, ref MISSING);` and processing it the same way.
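Problem: 1. When reading a table, `wordApp.Selection.EndKey` reads only the first column.
I want to process the data in all columns. Is there any way to identify whether the content is a paragraph or a table?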
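Scanning the document with the Selection object is likely to be very expensive in terms of performance. I suggest the following code, which collects the range of every table and then checks each paragraph against those ranges: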
    // Assumes: using System; using System.Collections.Generic;
    //          using Word = Microsoft.Office.Interop.Word;
    // Note: the named arguments below require C# 4.0 or later.
    List<Word.Range> TablesRanges = new List<Word.Range>();
    Word.Application wordApp = new Word.Application();
    Word.Document doc = wordApp.Documents.OpenNoRepairDialog(FileName: @"c:\AAAAA.docx", ConfirmConversions: false, ReadOnly: true, AddToRecentFiles: false, NoEncodingDialog: true);

    // Remember the range covered by each table (Word collections are 1-based).
    for (int iCounter = 1; iCounter <= doc.Tables.Count; iCounter++)
    {
        Word.Range TRange = doc.Tables[iCounter].Range;
        TablesRanges.Add(TRange);
    }

    Boolean bInTable;
    for (int par = 1; par <= doc.Paragraphs.Count; par++)
    {
        bInTable = false;
        Word.Range r = doc.Paragraphs[par].Range;
        foreach (Word.Range range in TablesRanges)
        {
            // range.End is the position just past the table, where the next
            // paragraph starts, so the upper bound must be exclusive.
            if (r.Start >= range.Start && r.Start < range.End)
            {
                Console.WriteLine("In Table - Paragraph number " + par.ToString() + ":" + r.Text);
                bInTable = true;
                break;
            }
        }
        if (!bInTable)
            Console.WriteLine("!!!!!! Not In Table - Paragraph number " + par.ToString() + ":" + r.Text);
    }
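As a possible alternative to comparing range positions, Word can be asked directly whether a range lies inside a table through `WdInformation.wdWithInTable`. The following is only a minimal sketch under that assumption: the class and method names (`ParagraphScanner`, `DumpParagraphs`) are hypothetical, and `doc` is assumed to be an already opened `Word.Document`.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class ParagraphScanner
{
    // Hypothetical helper (not from the original answer): prints every
    // paragraph together with a flag saying whether it sits inside a table.
    public static void DumpParagraphs(Word.Document doc)
    {
        int number = 0;
        foreach (Word.Paragraph paragraph in doc.Paragraphs)
        {
            number++;
            Word.Range r = paragraph.Range;

            // get_Information returns a boxed bool for wdWithInTable.
            bool inTable = (bool)r.get_Information(Word.WdInformation.wdWithInTable);

            Console.WriteLine((inTable ? "In table" : "Not in table")
                + " - paragraph " + number + ": " + r.Text);
        }
    }
}
```

With this approach there is no need to build the list of table ranges first. In C# 4.0 and later the same call can typically be written as `r.Information[Word.WdInformation.wdWithInTable]`.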
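I found a workaround for the same problem. The approach is listed below.
1. Open the Word document using `WordApp.Documents.Open()`.
2. Traverse the document line by line using `Selection.MoveDown`.
3. Skip the content of table cells.
4. Finally, process the document's tables separately.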
    // Assumes: using Microsoft.Office.Interop.Word;
    // Assumed declarations (the original snippet does not show them):
    object wdLine = WdUnits.wdLine;
    object wdCountOne = 1;
    object wdMove = WdMovementType.wdMove;
    object wdExtend = WdMovementType.wdExtend;
    string currLine;

    // Process all paragraphs in the document.
    // Note: the loop moves down before reading, so if the selection starts on
    // the first line, that line has to be handled before entering the loop.
    while (doc.ActiveWindow.Selection.Bookmarks.Exists(@"\EndOfDoc") == false)
    {
        doc.ActiveWindow.Selection.MoveDown(ref wdLine, ref wdCountOne, ref wdMove);
        doc.ActiveWindow.Selection.HomeKey(ref wdLine, ref wdMove);

        // Skip table content: the column number is -1 outside a table.
        if (doc.ActiveWindow.Selection.get_Information(WdInformation.wdEndOfRangeColumnNumber).ToString() != "-1")
        {
            while (doc.ActiveWindow.Selection.get_Information(WdInformation.wdEndOfRangeColumnNumber).ToString() != "-1")
            {
                if (doc.ActiveWindow.Selection.Bookmarks.Exists(@"\EndOfDoc"))
                    break;
                doc.ActiveWindow.Selection.MoveDown(ref wdLine, ref wdCountOne, ref wdMove);
                doc.ActiveWindow.Selection.HomeKey(ref wdLine, ref wdMove);
            }
            doc.ActiveWindow.Selection.HomeKey(ref wdLine, ref wdMove);
        }

        // Extend the selection to the end of the line and read it.
        doc.ActiveWindow.Selection.EndKey(ref wdLine, ref wdExtend);
        currLine = doc.ActiveWindow.Selection.Text;
    }

    // Process all tables in the document.
    for (int iCounter = 1; iCounter <= doc.Tables.Count; iCounter++)
    {
        foreach (Row aRow in doc.Tables[iCounter].Rows)
        {
            foreach (Cell aCell in aRow.Cells)
            {
                currLine = aCell.Range.Text;
                // Process line
            }
        }
    }
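Since the question asks about replacing the content of one particular column, the table loop above could be narrowed along these lines. This is a sketch under assumptions rather than a tested implementation: `TableColumnEditor`, `ReplaceInColumn` and the `transform` delegate are hypothetical names standing in for the business logic. It iterates `table.Range.Cells` instead of `Rows`, because accessing individual rows fails on tables with vertically merged cells.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class TableColumnEditor
{
    // Hypothetical helper: applies `transform` to the text of every cell
    // in the given 1-based column of one table.
    public static void ReplaceInColumn(Word.Table table, int columnIndex,
                                       Func<string, string> transform)
    {
        foreach (Word.Cell cell in table.Range.Cells)
        {
            if (cell.ColumnIndex != columnIndex)
                continue;

            // Cell text ends with the end-of-cell marker ("\r\a");
            // strip it before handing the text to the business logic.
            string text = cell.Range.Text.TrimEnd('\r', '\a');

            // Assigning Range.Text replaces the cell content as plain text.
            cell.Range.Text = transform(text);
        }
    }
}
```

A call might look like `TableColumnEditor.ReplaceInColumn(doc.Tables[1], 2, s => s.ToUpper());`. Note that writing `Range.Text` is unlikely to preserve mixed character formatting inside a cell, and the document must not be opened read-only if the changes are to be saved.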