
Merge multiple Word documents into one with Open XML

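I have around 10 Word documents that I generate using Open XML and other tools. Now I would like to create another Word document and append them to it one by one. I would like to use Open XML; any hint would be appreciated. Below is my code: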

 private void CreateSampleWordDocument()
    {
        //string sourceFile = Path.Combine("D:\\GeneralLetter.dot");
        //string destinationFile = Path.Combine("D:\\New.doc");
        string sourceFile = Path.Combine("D:\\GeneralWelcomeLetter.docx");
        string destinationFile = Path.Combine("D:\\New.docx");
        try
        {
            // Create a copy of the template file and open the copy
            //File.Copy(sourceFile, destinationFile, true);
            using (WordprocessingDocument document = WordprocessingDocument.Open(destinationFile, true))
            {
                // Change the document type to Document
                document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
                //Get the Main Part of the document
                MainDocumentPart mainPart = document.MainDocumentPart;
                mainPart.Document.Save();
            }
        }
        catch
        {
        }
    }

Update (using AltChunks):

using (WordprocessingDocument myDoc = WordprocessingDocument.Open("D:\\Test.docx", true))
        {
            string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2) ;
            MainDocumentPart mainPart = myDoc.MainDocumentPart;
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (FileStream fileStream = File.Open("D:\\Test1.docx", FileMode.Open))
                chunk.FeedData(fileStream);
            AltChunk altChunk = new AltChunk();
            altChunk.Id = altChunkId;
            mainPart.Document
                .Body
                .InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
            mainPart.Document.Save();
        } 

Why does this code overwrite the content of the last file when I use multiple files?

Update 2:

 using (WordprocessingDocument myDoc = WordprocessingDocument.Open("D:\\Test.docx", true))
        {

            MainDocumentPart mainPart = myDoc.MainDocumentPart;
            string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 3);
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (FileStream fileStream = File.Open("d:\\Test1.docx", FileMode.Open))
            {
                chunk.FeedData(fileStream);
                AltChunk altChunk = new AltChunk();
                altChunk.Id = altChunkId;
                mainPart.Document
                    .Body
                    .InsertAfter(altChunk, mainPart.Document.Body
                    .Elements<Paragraph>().Last());
                mainPart.Document.Save();
            }
            using (FileStream fileStream = File.Open("d:\\Test2.docx", FileMode.Open))
            {
                chunk.FeedData(fileStream);
                AltChunk altChunk = new AltChunk();
                altChunk.Id = altChunkId;
                mainPart.Document
                    .Body
                    .InsertAfter(altChunk, mainPart.Document.Body
                    .Elements<Paragraph>().Last());
            }
            using (FileStream fileStream = File.Open("d:\\Test3.docx", FileMode.Open))
            {
                chunk.FeedData(fileStream);
                AltChunk altChunk = new AltChunk();
                altChunk.Id = altChunkId;
                mainPart.Document
                    .Body
                    .InsertAfter(altChunk, mainPart.Document.Body
                    .Elements<Paragraph>().Last());
            } 
        }

This code appends the Test2 data twice, and it also does so in place of the Test1 data. That is, I get:

Test
Test2
Test2

Instead of:

Test
Test1
Test2

Using only the Open XML SDK, you can use the AltChunk element to merge multiple documents into one.

This link, the-easy-way-to-assemble-multiple-word-documents, and this one, How to Use altChunk for Document Assembly, provide some samples.

EDIT 1

Based on the altChunk code in your updated question (update #1), here is the VB.Net code I tested, and it works like a charm for me:

Using myDoc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open("D:\\Test.docx", True)
        Dim altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2)
        Dim mainPart = myDoc.MainDocumentPart
        Dim chunk = mainPart.AddAlternativeFormatImportPart(
            DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML, altChunkId)
        Using fileStream As IO.FileStream = IO.File.Open("D:\\Test1.docx", IO.FileMode.Open)
            chunk.FeedData(fileStream)
        End Using
        Dim altChunk = New DocumentFormat.OpenXml.Wordprocessing.AltChunk()
        altChunk.Id = altChunkId
        mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements(Of DocumentFormat.OpenXml.Wordprocessing.Paragraph).Last())
        mainPart.Document.Save()
End Using

EDIT 2

The second issue (update #2)

"This code appends the Test2 data twice, in place of the Test1 data as well."

is related to the altChunkId.

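For each document you want to merge into the main document, you need to:

  1. Add an AlternativeFormatImportPart to the mainDocumentPart, with an Id that must be unique. This part contains the inserted data.
  2. Add an AltChunk element to the body, and set its Id to reference the AlternativeFormatImportPart added in the previous step.

In your code, you use the same Id for all the AltChunks, and you feed every file into the same AlternativeFormatImportPart. That is why you see the same text several times.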

I am not sure the altChunkId in your code will be unique: string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2);
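To illustrate the concern, here is a small standalone sketch that needs only the base class library (the names `AltChunkIdDemo` and `MakeId` are made up for this example). `DateTime.Now.Ticks` is an 18-digit number at present, so its first two or three digits change only once every few years, and ids built this way within the same run come out identical. The GUID suffix at the end is just one hypothetical way to get distinct ids; it is not taken from the question.

```csharp
using System;

class AltChunkIdDemo
{
    // Same expression as in the question: only the first two digits of Ticks are kept.
    static string MakeId() =>
        "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2);

    static void Main()
    {
        string first = MakeId();
        string second = MakeId();

        // The leading digits of Ticks stay the same for years,
        // so in practice both calls return the same string.
        Console.WriteLine(first + " / " + second);
        Console.WriteLine("Tick-based ids equal: " + (first == second));

        // Hypothetical alternative (an assumption, not from the question):
        // a GUID suffix differs on every call.
        string a = "AltChunkId" + Guid.NewGuid().ToString("N");
        string b = "AltChunkId" + Guid.NewGuid().ToString("N");
        Console.WriteLine("GUID-based ids equal: " + (a == b));
    }
}
```

Run on its own, the first comparison should report equal ids and the second should not, which matches the duplicated content described in the question.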

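If you do not need to set a specific value, I recommend that you do not set the altChunkId explicitly when adding the AlternativeFormatImportPart. Instead, use the one generated by the SDK, like this: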

VB.Net

Dim chunk As AlternativeFormatImportPart = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML)
Dim altchunkid As String = mainPart.GetIdOfPart(chunk)

C#

AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML);
string altchunkid = mainPart.GetIdOfPart(chunk);

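There is a nice wrapper API (Document Builder 2.2) around Open XML, specially designed for merging documents, with the flexibility to choose which paragraphs to merge, and so on. You can download it from here (update: it has since been moved).

The documentation and screenshots on how to use it are here.

Update: code sample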

 var sources = new List<Source>();
 //Document Streams (File Streams) of the documents to be merged.
 foreach (var stream in documentstreams)
 {
        var tempms = new MemoryStream();
        stream.CopyTo(tempms);
        sources.Add(new Source(new WmlDocument(stream.Length.ToString(), tempms), true));
 }

  var mergedDoc = DocumentBuilder.BuildDocument(sources);
  mergedDoc.SaveAs(@"C:\TargetFilePath");

The Source and WmlDocument types come from the Document Builder API.

You can even add the file paths directly if you prefer:

sources.Add(new Source(new WmlDocument(@"C:\FileToBeMerged1.docx")));
sources.Add(new Source(new WmlDocument(@"C:\FileToBeMerged2.docx")));

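I found this nice comparison between the AltChunk and Document Builder approaches to merging documents; it helps you choose based on your requirements.

You can also use the DocX library to merge documents, but I prefer Document Builder for this.

Hope this helps.

The only thing missing in these answers is the for loop.

For those who just want to copy/paste it: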

void MergeInNewFile(string resultFile, IList<string> filenames)
{
    using (WordprocessingDocument document = WordprocessingDocument.Create(resultFile, WordprocessingDocumentType.Document))
    {
        MainDocumentPart mainPart = document.AddMainDocumentPart();
        mainPart.Document = new Document(new Body());

        foreach (string filename in filenames)
        {
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
            string altChunkId = mainPart.GetIdOfPart(chunk);

            using (FileStream fileStream = File.Open(filename, FileMode.Open))
            {
                chunk.FeedData(fileStream);
            }

            AltChunk altChunk = new AltChunk { Id = altChunkId };
            mainPart.Document.Body.AppendChild(altChunk);
        }

        mainPart.Document.Save();
    }
}

All credit goes to Chris and yonexbat.

Easy to use in C#:

using System;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

namespace WordMergeProject
{
    public class Program
    {
        private static void Main(string[] args)
        {
            byte[] word1 = File.ReadAllBytes(@"..\..\word1.docx");
            byte[] word2 = File.ReadAllBytes(@"..\..\word2.docx");

            byte[] result = Merge(word1, word2);

            File.WriteAllBytes(@"..\..\word3.docx", result);
        }

        private static byte[] Merge(byte[] dest, byte[] src)
        {
            string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();

            // Copy dest into an expandable stream so the package can grow while it is edited
            var memoryStreamDest = new MemoryStream();
            memoryStreamDest.Write(dest, 0, dest.Length);
            memoryStreamDest.Seek(0, SeekOrigin.Begin);
            var memoryStreamSrc = new MemoryStream(src);

            using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStreamDest, true))
            {
                MainDocumentPart mainPart = doc.MainDocumentPart;
                AlternativeFormatImportPart altPart =
                    mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
                altPart.FeedData(memoryStreamSrc);
                var altChunk = new AltChunk();
                altChunk.Id = altChunkId;

                // Insert after the last AltChunk, or after the last paragraph if there is no AltChunk yet
                DocumentFormat.OpenXml.OpenXmlElement lastElem = mainPart.Document.Body.Elements<AltChunk>().LastOrDefault();
                if (lastElem == null)
                {
                    lastElem = mainPart.Document.Body.Elements<Paragraph>().Last();
                }

                // Build a paragraph holding a page break
                Paragraph pageBreakP = new Paragraph();
                Run pageBreakR = new Run();
                Break pageBreakBr = new Break() { Type = BreakValues.Page };

                pageBreakP.Append(pageBreakR);
                pageBreakR.Append(pageBreakBr);

                // Add the page break, then the chunk, and save before the package is closed
                mainPart.Document.Body.InsertAfter(pageBreakP, lastElem);
                mainPart.Document.Body.InsertAfter(altChunk, pageBreakP);
                mainPart.Document.Save();
            }

            return memoryStreamDest.ToArray();
        }
    }
}

My solution:

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

namespace TestFusionWord
{
    internal class Program
    {
        public static void MergeDocx(List<string> ListPathFilesToMerge, string DestinationPathFile, bool OverWriteDestination, bool WithBreakPage)
        {
            #region Control arguments

            List<string> ListError = new List<string>();
            if (ListPathFilesToMerge == null || ListPathFilesToMerge.Count == 0)
            {
                ListError.Add("There are no files to merge in the list passed in parameter ListPathFilesToMerge");
            }
            else
            {
                foreach (var item in ListPathFilesToMerge.Where(x => Path.GetExtension(x.ToLower()) != ".docx"))
                {
                    ListError.Add(string.Format("The file '{0}' given in the list passed in parameter ListPathFilesToMerge does not have the .docx extension", item));
                }

                foreach (var item in ListPathFilesToMerge.Where(x => !File.Exists(x)))
                {
                    ListError.Add(string.Format("The file '{0}' given in the list passed in parameter ListPathFilesToMerge does not exist", item));
                }
            }

            if (string.IsNullOrWhiteSpace(DestinationPathFile))
            {
                ListError.Add("The destination file passed in parameter DestinationPathFile cannot be empty");
            }
            else
            {
                if (Path.GetExtension(DestinationPathFile.ToLower()) != ".docx")
                {
                    ListError.Add(string.Format("The destination file '{0}' given in parameter DestinationPathFile does not have the .docx extension", DestinationPathFile));
                }

                if (File.Exists(DestinationPathFile) && !OverWriteDestination)
                {
                    ListError.Add(string.Format("The destination file '{0}' already exists. Use the OverWriteDestination argument if you want to overwrite it", DestinationPathFile));
                }
            }

            if (ListError.Any())
            {
                string MessageError = "Errors were encountered, details: " + Environment.NewLine + ListError.Select(x => "- " + x).Aggregate((x, y) => x + Environment.NewLine + y);
                throw new ArgumentException(MessageError);
            }

            #endregion Control arguments

            #region Merge Files

            //Delete the destination file (no error is raised if the file does not exist)
            File.Delete(DestinationPathFile);

            //Create an empty destination file
            using (WordprocessingDocument document = WordprocessingDocument.Create(DestinationPathFile, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = document.AddMainDocumentPart();
                mainPart.Document = new Document(new Body());
                document.MainDocumentPart.Document.Save();
            }

            //Merge the documents
            using (WordprocessingDocument myDoc = WordprocessingDocument.Open(DestinationPathFile, true))
            {
                MainDocumentPart mainPart = myDoc.MainDocumentPart;
                Body body = mainPart.Document.Body;

                for (int i = 0; i < ListPathFilesToMerge.Count; i++)
                {
                    string currentpathfile = ListPathFilesToMerge[i];
                    AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
                    string altchunkid = mainPart.GetIdOfPart(chunk);

                    using (FileStream fileStream = File.Open(currentpathfile, FileMode.Open))
                        chunk.FeedData(fileStream);

                    AltChunk altChunk = new AltChunk();
                    altChunk.Id = altchunkid;

                    OpenXmlElement last = body.Elements().LastOrDefault(e => e is AltChunk || e is Paragraph);
                    body.InsertAfter(altChunk, last);

                    if (WithBreakPage && i < ListPathFilesToMerge.Count - 1) // If it is not the last file, add a page break
                    {
                        last = body.Elements().LastOrDefault(e => e is AltChunk || e is Paragraph);
                        last.InsertAfterSelf(new Paragraph(new Run(new Break() { Type = BreakValues.Page })));
                    }
                }

                mainPart.Document.Save();
            }

            #endregion Merge Files
        }

        private static int Main(string[] args)
        {
            try
            {
                string DestinationPathFile = @"C:\temp\testfusion\docfinal.docx";

                List<string> ListPathFilesToMerge = new List<string>()
                                    {
                                        @"C:\temp\testfusion\fichier1.docx",
                                        @"C:\temp\testfusion\fichier2.docx",
                                        @"C:\temp\testfusion\fichier3.docx"
                                    };

                ListPathFilesToMerge.Sort(); //Sort so the files are always merged in the same order

                MergeDocx(ListPathFilesToMerge, DestinationPathFile, true, true);

#if DEBUG
                Process.Start(DestinationPathFile); //open file
#endif
                return 0;
            }
            catch (Exception Ex)
            {
                Console.Error.WriteLine(Ex.Message);
                //Log exception here
                return -1;
            }
            

        }
    }
}
