
How to programmatically (C#) determine the page count of .docx files

I have about 400 files in .docx format, and I need to determine the length of each one in pages.

So I want to write C# code that selects the folder containing the documents and then returns the page count of each .docx file.

To illustrate how this can be done, I have just created a C# console application based on .NET 4.5 and some of the Microsoft Office 2013 COM objects.

using System;
using Microsoft.Office.Interop.Word;

namespace WordDocStats
{
    class Program
    {
        // Based on: http://www.dotnetperls.com/word
        static void Main(string[] args)
        {
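            // Start Word and open the document read-only.
            var application = new Application();
            try
            {
                var document = application.Documents.Open(@"C:\Users\MyName\Documents\word.docx", ReadOnly: true);

                // Get the page count (footnotes and endnotes excluded).
                var numberOfPages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);

                // Print out the result.
                Console.WriteLine("Total number of pages in document: {0}", numberOfPages);

                // Close the document without saving changes.
                document.Close(false);
            }
            finally
            {
                // Quit Word even if something above throws, so no WINWORD.EXE process is left behind.
                application.Quit();
            }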
        }
    }
}

For this to work you need to reference the following COM objects:

  • Microsoft Office Object Library (version 15.0 in my case)
  • Microsoft Word Object Library (version 15.0 in my case)

These two COM objects give you access to the namespaces needed.

For details on how to reference the correct assemblies, please refer to the section "3. Setting Up Work Environment:" at: http://www.c-sharpcorner.com/UploadFile/amrish_deep/WordAutomation05102007223934PM/WordAutomation.aspx

For a quick and more general introduction to Word automation through C#, see: http://www.dotnetperls.com/word

-- UPDATE

Documentation for the Document.ComputeStatistics method, which gives you access to the page count, can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.word.document.computestatistics.aspx

As seen in the documentation, the method takes a WdStatistic enum value that lets you retrieve different kinds of statistics, e.g. the total number of pages. For an overview of the complete range of statistics you have access to, please refer to the documentation of the WdStatistic enum, which can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdstatistic.aspx
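Applied to the original question (a whole folder of files), the same calls can be wrapped in a loop. The following is only a sketch under assumptions: the class name FolderStats and the default folder path are made up for illustration, Word must be installed, and the project needs the COM references listed above. It reuses a single Word instance for all files and also reads a second statistic (wdStatisticWords) to show how other WdStatistic values are used.

```csharp
using System;
using System.IO;
using Microsoft.Office.Interop.Word;

static class FolderStats
{
    static void Main(string[] args)
    {
        // Hypothetical default folder; pass the real one as the first argument.
        var folder = args.Length > 0 ? args[0] : @"C:\Users\MyName\Documents";

        // Start Word once and reuse it for every file.
        var application = new Application();
        try
        {
            foreach (var path in Directory.EnumerateFiles(folder, "*.docx"))
            {
                // Skip the "~$..." lock files Word creates next to open documents.
                if (Path.GetFileName(path).StartsWith("~$", StringComparison.Ordinal))
                    continue;

                var document = application.Documents.Open(path, ReadOnly: true);
                try
                {
                    // Second argument: do not include footnotes and endnotes.
                    var pages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);
                    var words = document.ComputeStatistics(WdStatistic.wdStatisticWords, false);
                    Console.WriteLine("{0}: {1} pages, {2} words",
                        Path.GetFileName(path), pages, words);
                }
                finally
                {
                    // The cast avoids the method/event name ambiguity warning on Close.
                    ((_Document)document).Close(false);
                }
            }
        }
        finally
        {
            // Quit without saving, even if a file failed to open.
            ((_Application)application).Quit(false);
        }
    }
}
```

Starting Word is slow, so keeping one Application object alive for all 400 files should be noticeably faster than creating one per document.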

Use DocumentFormat.OpenXml.dll. You can find the DLL in C:\Program Files\Open XML SDK\V2.0\lib

Sample code:

using (var doc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(docxPath, false))
{
    MessageBox.Show(doc.ExtendedFilePropertiesPart.Properties.Pages.InnerText);
}

To use the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class, you need to add the following references to your project:

DocumentFormat.OpenXml.dll and WindowsBase.dll

Modern solution (based on Jignesh Thakker's answer): the Open XML SDK is no longer at that location, but it is published on GitHub and even supports .NET Core. You do not need MS Office on the server or on the machine running the code.

Install the NuGet package:

Install-Package DocumentFormat.OpenXml

The code:

using DocumentFormat.OpenXml.Packaging;

private int CountWordPage(string filePath)
{
    using (var wordDocument = WordprocessingDocument.Open(filePath, false))
    {
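        // Reads the page count stored in the extended file properties (docProps/app.xml).
        // This is the value written by the application that last saved the file.
        return int.Parse(wordDocument.ExtendedFilePropertiesPart.Properties.Pages.Text);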
    }
}
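If adding a package is not an option, the same stored value can be read with the framework alone, because a .docx file is a ZIP archive. The sketch below is a swapped-in technique, not the Open XML SDK approach above: it assumes the extended properties live at the conventional path docProps/app.xml, and the names DocxPageCounter, TryReadStoredPageCount and PrintPageCounts are hypothetical. It returns null when no page count is stored, and it also loops over a folder as the question asks.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class DocxPageCounter
{
    // XML namespace of the extended-properties part.
    private static readonly XNamespace Ep =
        "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties";

    // Returns the stored page count, or null if the file does not contain one.
    public static int? TryReadStoredPageCount(string docxPath)
    {
        using (var zip = ZipFile.OpenRead(docxPath))
        {
            // Assumption: the part sits at its conventional location in the package.
            var entry = zip.GetEntry("docProps/app.xml");
            if (entry == null) return null;

            using (var stream = entry.Open())
            {
                var pages = XDocument.Load(stream).Root?.Element(Ep + "Pages");
                int count;
                return pages != null && int.TryParse(pages.Value, out count)
                    ? count
                    : (int?)null;
            }
        }
    }

    // Prints "<file name>: <pages>" for every .docx directly inside the folder.
    public static void PrintPageCounts(string folder)
    {
        foreach (var path in Directory.EnumerateFiles(folder, "*.docx"))
        {
            // Skip the "~$..." lock files Word creates; they are not valid archives.
            if (Path.GetFileName(path).StartsWith("~$", StringComparison.Ordinal))
                continue;

            var pages = TryReadStoredPageCount(path);
            Console.WriteLine("{0}: {1}", Path.GetFileName(path),
                pages?.ToString() ?? "unknown");
        }
    }
}
```

A call such as DocxPageCounter.PrintPageCounts(@"C:\Users\MyName\Documents") would list every file in that folder. On .NET Framework, ZipFile additionally needs a reference to System.IO.Compression.FileSystem. As with the SDK version, the number is metadata rather than a fresh layout calculation, so it can be missing or out of date for files produced by tools other than Word.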

You can use Spire.Doc; the page count feature is free :)

using System.IO;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Spire.Doc;

// Resource and BarcodeAttacher appear to be types from the answerer's own project, not part of Spire.Doc.
[TestClass]
public sealed class TestNcWorker
{
    [TestMethod]
    public void DocTemplate3851PageCount()
    {
        var docTemplate3851 = Resource.DocTemplate3851;
        // Wrap the bytes directly so the stream starts at position 0.
        using (var ms = new MemoryStream(docTemplate3851))
        {
            Document document = new Document();
            document.LoadFromStream(ms, FileFormat.Docx);
            Assert.AreEqual(2, document.PageCount);
        }
        var barCoder = new BarcodeAttacher("8429053", "1319123", "HR3514");
        var barcoded = barCoder.AttachBarcode(docTemplate3851).Value;
        using (var ms = new MemoryStream(barcoded))
        {
            Document document = new Document();
            document.LoadFromStream(ms, FileFormat.Docx);
            Assert.AreEqual(3, document.PageCount);
        }
    }
}
