Word found unreadable content in xxx.docx after splitting a docx with Open XML

Question

I have a full.docx that contains two math questions. The docx embeds some pictures and MathType equations (OLE objects). I split the document according to this and got two files (first.docx, second.docx). first.docx works fine, but second.docx pops up a warning dialog when I try to open it:

"Word found unreadable content in second.docx. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

After clicking "Yes", the document opens and its content is correct. I want to know what is wrong with second.docx. I have checked it with the "Open XML SDK 2.5 Productivity Tool" but found no cause. Any help is much appreciated. Thanks.

The three files have been uploaded here.

Here is some code:

        // Load the template into an expandable in-memory stream so the package can be edited.
        byte[] templateBytes = System.IO.File.ReadAllBytes(TEMPLATE_YANG_FILE);
        using (MemoryStream templateStream = new MemoryStream())
        {
            templateStream.Write(templateBytes, 0, templateBytes.Length);

            string guidStr = Guid.NewGuid().ToString();

            using (WordprocessingDocument document = WordprocessingDocument.Open(templateStream, true))
            {
                document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);

                MainDocumentPart mainPart = document.MainDocumentPart;

                mainPart.Document = new Document();
                Body bd = new Body();

                foreach (DocumentFormat.OpenXml.Wordprocessing.Paragraph clonedParagrph in lst)
                {
                    bd.AppendChild<DocumentFormat.OpenXml.Wordprocessing.Paragraph>(clonedParagrph);

                    clonedParagrph.Descendants<Blip>().ToList().ForEach(blip =>
                    {
                        var newRelation = document.CopyImage(blip.Embed, this.wordDocument);
                        blip.Embed = newRelation;
                    });

                    clonedParagrph.Descendants<DocumentFormat.OpenXml.Vml.ImageData>().ToList().ForEach(imageData =>
                    {
                        var newRelation = document.CopyImage(imageData.RelationshipId, this.wordDocument);
                        imageData.RelationshipId = newRelation;
                    });
                }

                mainPart.Document.Body = bd;
                mainPart.Document.Save();
            }

            string subDocFile = System.IO.Path.Combine(this.outDir, guidStr + ".docx");
            this.subWordFileLst.Add(subDocFile);

            File.WriteAllBytes(subDocFile, templateStream.ToArray());
        }

lst contains paragraphs cloned from the original docx using:

(DocumentFormat.OpenXml.Wordprocessing.Paragraph)p.Clone();

Answer 1

Using the Productivity Tool, I found that the oleobjectx.bin parts were not copied, so I added the code below after the Blip and ImageData copying:

clonedParagrph.Descendants<OleObject>().ToList().ForEach(ole =>
{
    var newRelation = document.CopyOleObject(ole.Id, this.wordDocument);
    ole.Id = newRelation;
});

This solved the issue.
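
The answer does not show CopyOleObject itself. As far as I know it is not part of the Open XML SDK, so it is presumably the poster's own extension method, like CopyImage. A minimal sketch of what such a helper might look like follows, assuming the OLE data lives in an EmbeddedObjectPart of the source main document part (as it does for MathType equations). The class name OleCopyExtensions and the parameter names are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class OleCopyExtensions
{
    // Hypothetical helper, not an SDK method: copies the embedded OLE part that
    // relationshipId points to in the source document into the target document
    // and returns the relationship id of the new part.
    public static string CopyOleObject(
        this WordprocessingDocument target,
        string relationshipId,
        WordprocessingDocument source)
    {
        // Resolve the binary part (e.g. oleObject1.bin) in the source package.
        OpenXmlPart sourcePart = source.MainDocumentPart.GetPartById(relationshipId);

        // Create an empty embedded object part with the same content type.
        EmbeddedObjectPart targetPart =
            target.MainDocumentPart.AddEmbeddedObjectPart(sourcePart.ContentType);

        // Copy the raw bytes across.
        using (Stream data = sourcePart.GetStream(FileMode.Open, FileAccess.Read))
        {
            targetPart.FeedData(data);
        }

        // The caller writes this id back to the cloned element (ole.Id).
        return target.MainDocumentPart.GetIdOfPart(targetPart);
    }
}
```

Without a step like this, the cloned OleObject elements keep relationship ids that do not exist in the new package, which is consistent with Word reporting unreadable content for second.docx.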

0 · Accepted · 2020-02-13 14:23:26
