How to get the range of the first paragraph on each page of a Word document using C# Word interop
I have a Word file with 9 pages.
I use this:
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document wordDoc = wordApp.Documents.Open(file);
Microsoft.Office.Interop.Word.Range docRange = wordDoc.Range();
But this code gives me the range of all the paragraphs (the whole document).
How can I get the range of the text in the first line (or first paragraph) of each page using C# Word interop?
Sorry about my English.
For example, on the first page I want to get the text:
"Apple Inc. is an American multinational technology company headquartered in Cupertino, California,"
or the first paragraph:
"Apple Inc. is an American multinational technology company headquartered in Cupertino, California, that designs, develops, and sells consumer electronics, computer software, and online services. It is considered one of the Big Four technology companies, alongside Amazon, Google, and Microsoft."
On the second page, the text I want is:
Apple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976 to develop and sell
or
Apple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976 to develop and sell Wozniak's Apple I personal computer, though Wayne sold his share back within 12 days.
You can try iterating over all the paragraphs and recording the page number of each one, then select the first paragraph of each page.
using System.Collections.Generic;
using System.Linq;
using Word = Microsoft.Office.Interop.Word;

private List<Paragraph> FindFirstParagraphOfEachPage(string filePath)
{
    Word.Application wordApp = new Word.Application();
    Word.Document wordDoc = wordApp.Documents.Open(filePath);
    var paragraphs = new List<Paragraph>();
    foreach (Word.Paragraph p in wordDoc.Paragraphs)
    {
        paragraphs.Add(new Paragraph()
        {
            // Page on which the paragraph's range ends.
            PageNumber = (int)p.Range.get_Information(Word.WdInformation.wdActiveEndPageNumber),
            ParagraphText = p.Range.Text
        });
    }
    // Skip blank paragraphs, then keep the first remaining paragraph of each page.
    var result = paragraphs.Where(x => !string.IsNullOrWhiteSpace(x.ParagraphText))
        .GroupBy(x => x.PageNumber)
        .Select(x => x.First())
        .ToList();
    wordDoc.Close();
    wordApp.NormalTemplate.Saved = true;
    wordApp.Quit();
    return result;
}
Helper class to store the page number and paragraph text:
class Paragraph
{
    public int PageNumber { get; set; }
    public string ParagraphText { get; set; }
}
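The selection step does not depend on Word, so it can be checked on its own. The following is a minimal sketch of that step only; `PageParagraph` and `FirstParagraphs.FirstPerPage` are hypothetical names introduced here for illustration, standing in for the helper class and the LINQ query in the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the answer's helper class.
class PageParagraph
{
    public int PageNumber { get; set; }
    public string Text { get; set; }
}

static class FirstParagraphs
{
    // Returns the first non-blank paragraph of each page.
    // Assumes the input is already in document order.
    public static List<PageParagraph> FirstPerPage(IEnumerable<PageParagraph> paragraphs)
    {
        return paragraphs
            .Where(p => !string.IsNullOrWhiteSpace(p.Text)) // drop empty paragraphs
            .GroupBy(p => p.PageNumber)                     // one group per page
            .Select(g => g.First())                         // first paragraph in each group
            .ToList();
    }
}
```

LINQ to Objects keeps the elements of each group in source order, which is why `First()` picks the earliest paragraph of a page as long as the input follows document order.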
I am not sure about releasing the COM objects. It will probably require some edits and testing.
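On the question of releasing the objects, one possible approach is to wrap the Word session in try/finally and release the document and application explicitly. This is an untested sketch, not a definitive pattern: `WordSession.WithDocument` is a hypothetical helper, it assumes the document should be opened read-only and closed without saving, and it leaves intermediate COM objects such as the `Documents` collection to the runtime.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class WordSession
{
    // Hypothetical helper: opens filePath, runs the callback, then closes Word.
    // The callback should return plain data (strings, numbers), not COM objects,
    // because the document is closed before the result reaches the caller.
    public static T WithDocument<T>(string filePath, Func<Word.Document, T> work)
    {
        Word.Application wordApp = null;
        Word.Document wordDoc = null;
        try
        {
            wordApp = new Word.Application();
            wordDoc = wordApp.Documents.Open(filePath, ReadOnly: true);
            return work(wordDoc);
        }
        finally
        {
            // Runs even if the callback throws, so Word is not left running.
            if (wordDoc != null)
            {
                wordDoc.Close(SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
                Marshal.ReleaseComObject(wordDoc);
            }
            if (wordApp != null)
            {
                wordApp.Quit();
                Marshal.ReleaseComObject(wordApp);
            }
        }
    }
}
```

A call might look like `WordSession.WithDocument(path, doc => doc.Paragraphs.Count)`, where `path` is whatever file path the caller already has.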