Unable to open `docx` files client-side using a Blob object - vanilla JavaScript

This is the client-side code. It is a minimal, complete, and verifiable snippet that lets fellow developers test this for themselves.

// Requires: a string that contains HTML tags.
// Returns: a Word document that can be downloaded with the extension .doc or .docx.
// @param cvAsHTML a string that contains HTML tags

const preHtml = "<html xmlns:v='urn:schemas-microsoft-com:vml' xmlns:o='urn:schemas-microsoft-com:office:office' xmlns:w='urn:schemas-microsoft-com:office:word' xmlns='http://www.w3.org/TR/html4/loose.dtd\'><head><meta charset='utf-8'></head><body>";
const postHtml = "</body></html>";
const html = preHtml + cvAsHTML + postHtml;

let filename = "filename";
const blob = new Blob(["\ufeff", html], { type: "application/msword"});

The above snippet works like a charm. Please note that the XML namespace declarations on the html tag are redundant and not actually needed. The doc file would work without them, but the head and body tags must be present.
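The snippet builds the Blob but does not show how it is handed to the user. The following is a minimal sketch of one common way to do that in a browser, assuming a temporary link with a `download` attribute is acceptable. The helper names `wrapHtmlForWord`, `buildDocBlob`, and `downloadBlob` are hypothetical and not part of the original code.

```javascript
// Wraps an HTML fragment in the minimal document used for the .doc download.
// Per the note above, only the head and body tags are kept here.
function wrapHtmlForWord(bodyHtml) {
  return "<html><head><meta charset='utf-8'></head><body>" + bodyHtml + "</body></html>";
}

// Builds the Blob the same way as the snippet above: BOM + HTML, msword MIME type.
function buildDocBlob(bodyHtml) {
  return new Blob(["\ufeff", wrapHtmlForWord(bodyHtml)], { type: "application/msword" });
}

// Browser only: offers the Blob as a download through a temporary link.
function downloadBlob(blob, filename) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(url), 0);
}
```

In a page, this could be called as `downloadBlob(buildDocBlob(cvAsHTML), "filename.doc")`.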

For docx files, I am unable to download a working file. The file appears to be corrupted, and after several attempts I really do not know what to do. This is the code for docx files:

const preHtml = "<?xml version='1.0' encoding='UTF-8?><html xmlns:v='urn:schemas-microsoft-com:vml' xmlns:o='urn:schemas-microsoft-com:office:office' xmlns:w='urn:schemas-microsoft-com:office:word' xmlns='http://www.w3.org/TR/html4/loose.dtd\'><head><meta charset='utf-8'></head><body>";
const postHtml = "</body></html>";
const html = preHtml + cvAsHTML + postHtml;

let filename = "filename.docx";
const blob = new Blob(["\ufeff", html], { type: "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main"});

Note: I have changed the MIME type inside the Blob object and tried other options as well, such as application/zip, application/octet-stream, etc., to no avail. I have also changed the preHtml variable to include:

<?xml version='1.0' encoding='UTF-8?>

I did this given my understanding that docx files are essentially zipped files containing XML segments...

I would really appreciate any help.
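The point that a docx file is a zip archive can be checked directly: a non-empty ZIP archive starts with the four signature bytes `PK\x03\x04`, whereas a Blob built from an HTML or XML string starts with the BOM and markup. A minimal sketch of that check follows; the function name `looksLikeZip` is hypothetical.

```javascript
// ZIP local-file-header signature: the bytes "PK\x03\x04".
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

// Returns true when the byte array starts with the ZIP signature.
function looksLikeZip(bytes) {
  return bytes.length >= 4 && ZIP_SIGNATURE.every((b, i) => bytes[i] === b);
}

// The same kind of content as in the docx attempt above: BOM + markup, not a zip.
const htmlBytes = new TextEncoder().encode("\ufeff<?xml version='1.0'?><html></html>");
console.log(looksLikeZip(htmlBytes)); // false
```

In a browser, the first bytes of a Blob can be read with `new Uint8Array(await blob.slice(0, 4).arrayBuffer())`. Changing only the MIME type does not change these bytes, which is consistent with the file being reported as corrupted.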

EDIT: 16-Dec-2019

This is the screenshot I took after trying the implementation suggested by @dw_:

The implementation using JSZip does not work as expected because:

  1. The browser does not natively offer to open the file in Microsoft Word, as it does with doc files;
  2. Users must save the file first, but even then the file won't open because it is corrupted.


A .docx file is a collection of compressed files. Using the simplified, minimal DOCX document as a guideline, I have created a ".zip" file containing the main word/document.xml file and 3 additional required files.

More information on .docx files can be found here: An Informal Introduction to DOCX

// Other needed files
const REQUIRED_FILES = {
  content_types_xml: `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>`,
  rels: `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>`,
  document_xml_rels: `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
</Relationships>`
};

/// --
// Opening part of word/document.xml, up to and including the opening paragraph tag
const preHtml = `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document
  xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
  xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing"
  xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
  xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup"
  xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk"
  xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
  xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
  mc:Ignorable="w14 wp14">
<w:body>
<w:p w:rsidR="005F670F" w:rsidRDefault="005F79F5">`;

// Closing part of word/document.xml: ends the paragraph, then page setup
const postHtml = `<w:bookmarkStart w:id="0" w:name="_GoBack"/>
<w:bookmarkEnd w:id="0"/>
</w:p>
<w:sectPr w:rsidR="005F670F">
  <w:pgSz w:w="12240" w:h="15840"/>
  <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/>
  <w:cols w:space="720"/>
  <w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
</w:document>`;

const cvAsHTML = `<w:r><w:t>Sample content inside .docx</w:t></w:r>`;
const html = preHtml + cvAsHTML + postHtml;

function generateDocx(fname) {
  let zip = new JSZip();
  // prerequisites:
  zip.file("_rels/.rels", REQUIRED_FILES.rels);
  zip.file("[Content_Types].xml", REQUIRED_FILES.content_types_xml);
  zip.file("word/_rels/document.xml.rels", REQUIRED_FILES.document_xml_rels);
  // main document part:
  zip.file("word/document.xml", html);
  zip.generateAsync({type:"blob"}).then(function(content) {
    saveAs(content, fname + ".docx");
  });
}

<script src="https://cdn.jsdelivr.net/npm/file-saver@2.0.2/dist/FileSaver.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jszip/3.2.2/jszip.min.js"></script>
<button onclick="generateDocx('test_1')">Download .docx</button>
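Note that the content placed inside word/document.xml is WordprocessingML (a `<w:r><w:t>` run), not HTML, so an HTML string cannot simply be dropped in. A minimal sketch of turning plain text into such a run follows, assuming the text goes into the paragraph that preHtml opens. The helper names `escapeXml` and `textToRun` are hypothetical and not part of the code above.

```javascript
// Escapes the characters that would otherwise break the XML of document.xml.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wraps plain text in a single run; xml:space="preserve" keeps leading and trailing spaces.
function textToRun(text) {
  return '<w:r><w:t xml:space="preserve">' + escapeXml(text) + "</w:t></w:r>";
}

console.log(textToRun("Tom & Jerry"));
// <w:r><w:t xml:space="preserve">Tom &amp; Jerry</w:t></w:r>
```

The result could take the place of the sample `cvAsHTML` value. Converting real HTML markup (headings, lists, bold text) into the corresponding WordprocessingML elements is a separate task that this sketch does not attempt.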

Libraries used: JSZip and FileSaver.js

External demo (as the inline one might not work)

I think it is not so simple. Documents with the docx extension are indeed zipped, but the archive does not contain a single file: a specific folder structure and specific filenames are required, see https://en.wikipedia.org/wiki/Office_Open_XML_file_formats . To be able to generate the document dynamically, you must generate the "minimum structure and files". You can see what I mean by saving an empty docx file and unzipping it. Try that in MS Word, LibreOffice, or any other office app; the structure will be "the same".
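As a rough illustration of the "minimum structure and files" idea, the sketch below checks a list of archive entry names against the four paths used in the JSZip answer on this page. Treating exactly these four as the minimum is an assumption taken from that answer, and the names `MINIMAL_DOCX_PARTS` and `missingParts` are hypothetical.

```javascript
// Paths written by the JSZip answer on this page (assumed here to be the minimum set).
const MINIMAL_DOCX_PARTS = [
  "[Content_Types].xml",
  "_rels/.rels",
  "word/document.xml",
  "word/_rels/document.xml.rels",
];

// Returns the expected parts that are absent from the given entry names.
function missingParts(entryNames) {
  const present = new Set(entryNames);
  return MINIMAL_DOCX_PARTS.filter((name) => !present.has(name));
}

console.log(missingParts(["[Content_Types].xml", "word/document.xml"]));
// [ '_rels/.rels', 'word/_rels/document.xml.rels' ]
```

With JSZip, the entry names of an archive should be available as `Object.keys(zip.files)`, so the same check could be run on a generated or unzipped document.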

For the zipping, maybe https://stuk.github.io/jszip/ can help.

For the document itself, I can suggest an approach we used. We prepared "a template document" in the office app and put placeholders in it, like $HEADER$, $BODY$, etc. Then, in the program, we unzipped it, replaced the placeholders with real strings, and zipped it again to produce the output. It was very effective and practical: we had full control over the final document, and it was very easy to change the design, colors, and static texts by just editing the template and uploading it again.
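The placeholder step of this template approach can be sketched as a plain string replacement over the unzipped XML. This is a minimal sketch, assuming placeholders of the form `$NAME$` in upper case and assuming each placeholder survives as contiguous text in the template's XML (an office app may split text across several runs). The function name `fillTemplate` is hypothetical, and the unzip and re-zip steps are left out.

```javascript
// Escapes replacement values so they cannot break the document XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Replaces every $NAME$ placeholder that has a value; unknown placeholders are left as-is.
function fillTemplate(xml, values) {
  return xml.replace(/\$([A-Z_]+)\$/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? escapeXml(values[name]) : match
  );
}

const templateXml = "<w:t>$HEADER$</w:t><w:t>$BODY$</w:t><w:t>$FOOTER$</w:t>";
console.log(fillTemplate(templateXml, { HEADER: "CV", BODY: "Skills & tools" }));
// <w:t>CV</w:t><w:t>Skills &amp; tools</w:t><w:t>$FOOTER$</w:t>
```

Using a replacer function rather than a replacement string avoids the special meaning of `$` sequences in `String.prototype.replace`.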
