
Fixing "Word found unreadable content in corrupt..." programmatically

I'm getting an OpenXml-generated docx file from another system. When I try to open the file in my application using Microsoft.Office.Interop.Word.Application.Open(filename), I get a "The file appears to be corrupted" exception.

When I open the docx file manually, I'm greeted with this prompt: "Word found unreadable content in corrupt xxx.docx. Do you want to recover the contents of this document? If you trust the source of this document, click Yes." When I click Yes, Word is able to recover the document into a new, unsaved Word file.

I have tried comparing the corrupt.docx file's document.xml with the recovered.docx file's document.xml. While there are many formatting changes between the two (extra space between closing XML tags), the main differences were that the AltChunk was actually embedded into recovered.docx and that several empty "run" tags had been removed. I'm not sure what would cause the file to be considered corrupt, as neither of those seems like it should.

That said, is there a way to programmatically run, from my application, whatever process happens when I click Yes at that "...Do you want to recover the contents of this document?..." prompt? This would be the ideal. Less preferably, is there a way to tell which parts of the XML are actually corrupt in a Word document?


  1. No, that's not exposed to the outside.
  2. Theoretically, validation could be possible. But given there's an AltChunk involved, that might not turn up the problem. The content of an AltChunk isn't integrated until Word processes the document, so validating the file beforehand never sees the merged result. If what's coming in "breaks" something, the validation won't pick that up.
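Before validating anything, it can help to see which AltChunk parts the package actually references. The following is a minimal Python sketch, not a validator: it only lists each altChunk element in the main document, the part its relationship points to, and whether that part exists in the zip. The function name find_alt_chunks is hypothetical, and the sketch assumes the main part lives at word/document.xml with its relationships in word/_rels/document.xml.rels, which is typical but not guaranteed.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_alt_chunks(docx_path):
    """Return a list of (relationship id, part name, part exists in zip)."""
    with zipfile.ZipFile(docx_path) as z:
        names = set(z.namelist())
        # Assumption: the main document part is at the usual location.
        doc = ET.fromstring(z.read("word/document.xml"))
        rels = ET.fromstring(z.read("word/_rels/document.xml.rels"))

    # Map relationship ids to their targets.
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter(f"{{{PKG_REL_NS}}}Relationship")
    }

    found = []
    for chunk in doc.iter(f"{{{W_NS}}}altChunk"):
        rid = chunk.get(f"{{{R_NS}}}id")
        target = targets.get(rid)
        part = None
        if target is not None:
            if target.startswith("/"):
                # Absolute target: relative to the package root.
                part = target.lstrip("/")
            else:
                # Relative target: relative to the word/ folder.
                part = posixpath.normpath(posixpath.join("word", target))
        found.append((rid, part, part in names))
    return found
```

An entry whose part is None or missing from the zip points at a dangling reference. An entry that looks fine says nothing about whether the chunk's content will merge cleanly, which is exactly the limitation described above.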

In this particular case, I might try removing the AltChunk manually (the pieces are in a few places in the zip file) and see if the file can open without it. But if you're not intimately familiar with the Word Open XML zip package, it might be better to ask the producer/source of the document.
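For a rough idea of what "a few places" means, here is a hedged Python sketch that writes a copy of the file with the AltChunk pieces taken out: the w:altChunk elements in document.xml, their entries in the relationships part, the chunk parts themselves, and any matching Override in [Content_Types].xml. The function names strip_alt_chunks and attr are hypothetical. The sketch assumes the usual part locations and the conventional w: and r: prefixes, and it edits the raw XML with regular expressions so that everything else stays byte-for-byte as the producer wrote it. Treat it as a diagnostic aid, not a repair tool.

```python
import posixpath
import re
import zipfile

DOC = "word/document.xml"              # assumed location of the main part
RELS = "word/_rels/document.xml.rels"  # its relationships part
TYPES = "[Content_Types].xml"

# Matches both <w:altChunk .../> and <w:altChunk ...>...</w:altChunk>.
ALT_CHUNK = re.compile(
    rb'<w:altChunk\b[^>]*/>|<w:altChunk\b[^>]*>.*?</w:altChunk>', re.DOTALL)


def attr(tag, name):
    """Return the value of attribute `name` in a raw tag, or None."""
    m = re.search(rb'\s' + re.escape(name) + rb'="([^"]*)"', tag)
    return m.group(1).decode("utf-8") if m else None


def strip_alt_chunks(src_path, dst_path):
    """Copy src_path to dst_path without AltChunks; return dropped parts."""
    with zipfile.ZipFile(src_path) as zin:
        order = zin.namelist()
        data = {name: zin.read(name) for name in order}

    # 1. Collect the relationship ids used by altChunk elements, then cut them.
    rel_ids = {attr(m.group(0), b"r:id") for m in ALT_CHUNK.finditer(data[DOC])}
    rel_ids.discard(None)
    data[DOC] = ALT_CHUNK.sub(b"", data[DOC])

    # 2. Drop the matching relationships and remember which parts they named.
    dropped = []
    for tag in re.findall(rb'<Relationship\b[^>]*>', data.get(RELS, b"")):
        target = attr(tag, b"Target")
        if attr(tag, b"Id") in rel_ids and target is not None:
            if target.startswith("/"):
                part = target.lstrip("/")
            else:
                part = posixpath.normpath(posixpath.join("word", target))
            dropped.append(part)
            data[RELS] = data[RELS].replace(tag, b"")

    # 3. Remove content-type overrides that point at the dropped parts.
    for tag in re.findall(rb'<Override\b[^>]*>', data.get(TYPES, b"")):
        name = attr(tag, b"PartName")
        if name and name.lstrip("/") in dropped:
            data[TYPES] = data[TYPES].replace(tag, b"")

    # 4. Write everything except the dropped parts, in the original order.
    with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as zout:
        for name in order:
            if name not in dropped:
                zout.writestr(name, data[name])
    return dropped
```

Calling strip_alt_chunks("corrupt.docx", "no_altchunk.docx") and then opening the copy in Word shows whether the AltChunk is what triggers the prompt. The regex approach is deliberate: parsing and re-serializing document.xml with a generic XML library can rename namespace prefixes, which may introduce new problems of its own.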

