C# Office Interop slow when looping through each word in a large document

I have the following code segment, which loops through each word in a Word document, finds the lines (sentences) that are bold, and stores them in a list, MasterSentences.

// Start Word and open the selected file; unused optional arguments are passed as Type.Missing.
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
object path = fileDialog.FileName;
object missing = Type.Missing;

// Assign the opened document directly instead of creating an empty Document first.
Document doc = word.Documents.Open(ref path, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
string sentence = "";
foreach (Range rng in doc.StoryRanges)
{
    foreach (Range rngWord in rng.Words)
    {
        if ((rngWord.Text.Contains("\n") || rngWord.Text.Contains("\r")) && sentence != "")
        {
            MasterSentences.Add(sentence);
            sentence = "";
        }
        else if (rngWord.Bold != 0 && rngWord.Text != " " && rngWord.Text != "\t")
        {
            sentence += rngWord.Text;
        }
    }
}

The problem is that this takes around 3-4 minutes to complete for a Word document with 23,742 words.

Is there any way to improve the speed? Is there a more efficient way to accomplish this?

Thank you all for the feedback. It seems using Range.Find.Execute was the right way. Here's my updated, mostly working code. It takes only around 20 seconds now, which is perfect.

Range rngFindBold = doc.Range();
rngFindBold.Find.Font.Bold = 1;
while (rngFindBold.Find.Execute(Format: true))
{
    if (!string.IsNullOrWhiteSpace(rngFindBold.Text))
    {
        MasterSentences.Add(rngFindBold.Text);
    }
}
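
For anyone adapting the snippet above, here is a slightly fuller sketch of the same Range.Find approach. It is not the original poster's code: the class name BoldSentenceExtractor, the methods Extract and SplitLines, and the non-advancing-match guard are all hypothetical additions. It assumes C# 4 or later (so the ref arguments on COM calls can be omitted), a reference to Microsoft.Office.Interop.Word, and Word installed on the machine. Because a bold run found by Find may span several paragraphs, the sketch splits each match on "\r" and "\n", mirroring the line handling in the original word-by-word loop.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class; the names are illustrative, not from the original post.
static class BoldSentenceExtractor
{
    // Splits one found run into lines on paragraph marks and line feeds,
    // trimming each piece and dropping empty ones.
    public static IEnumerable<string> SplitLines(string text)
    {
        if (string.IsNullOrEmpty(text)) yield break;
        foreach (string part in text.Split(new[] { '\r', '\n' },
                                           StringSplitOptions.RemoveEmptyEntries))
        {
            string trimmed = part.Trim();
            if (trimmed.Length > 0) yield return trimmed;
        }
    }

    // Collects bold text from the main text story of the document at 'path'.
    public static List<string> Extract(string path)
    {
        var sentences = new List<string>();
        var word = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = word.Documents.Open(path, ReadOnly: true);

            Word.Range range = doc.Content;   // main text story only
            Word.Find find = range.Find;
            find.ClearFormatting();           // drop criteria left from earlier searches
            find.Text = "";                   // match on formatting alone
            find.Font.Bold = 1;
            find.Format = true;
            find.Forward = true;
            find.Wrap = Word.WdFindWrap.wdFindStop;  // stop at the end, do not wrap

            int lastEnd = -1;
            while (find.Execute())
            {
                // Defensive guard (an assumption, not from the post):
                // stop if a match does not advance past the previous one.
                if (range.End <= lastEnd) break;
                lastEnd = range.End;

                sentences.AddRange(SplitLines(range.Text));
            }
        }
        finally
        {
            // Casting to the underscore interfaces avoids the method/event
            // name ambiguity on Close and Quit.
            if (doc != null) ((Word._Document)doc).Close(SaveChanges: false);
            ((Word._Application)word).Quit(SaveChanges: false);
        }
        return sentences;
    }
}
```

A call such as BoldSentenceExtractor.Extract(fileDialog.FileName) would then return the list that the code above stores in MasterSentences. The likely reason this style is faster is that each property read on a Range is a separate COM call into Word, so asking Word to search for bold formatting needs far fewer calls than inspecting 23,742 words one at a time.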


