
OpenXml: Convert from Word document to HTML with Header

I want to read a .docx file and send its content as the body of an email, not as an attachment.

For this I use Open XML and OpenXmlPowerTools to convert the .docx file to HTML. This worked fine until I got a document that has a header and footer with images.

Here is my code to convert .docx to HTML:

    // Needs the namespaces System.Drawing.Imaging, System.IO, System.Xml.Linq,
    // DocumentFormat.OpenXml.Packaging and OpenXmlPowerTools.
    // stream, imageDirectoryName and imageCounter are defined outside this snippet.
    using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
    {
        HtmlConverterSettings convSettings = new HtmlConverterSettings()
        {
            FabricateCssClasses = true,
            CssClassPrefix = "cls-",
            RestrictToSupportedLanguages = false,
            RestrictToSupportedNumberingFormats = false,
            ImageHandler = imageInfo =>
            {
                // Make sure the output directory for extracted images exists.
                DirectoryInfo localDirInfo = new DirectoryInfo(imageDirectoryName);
                if (!localDirInfo.Exists)
                {
                    localDirInfo.Create();
                }

                ++imageCounter;
                string extension = imageInfo.ContentType.Split('/')[1].ToLower();
                ImageFormat imageFormat = null;
                if (extension == "png")
                {
                    // PNG images are re-encoded and saved as JPEG.
                    extension = "jpeg";
                    imageFormat = ImageFormat.Jpeg;
                }
                else if (extension == "bmp")
                {
                    imageFormat = ImageFormat.Bmp;
                }
                else if (extension == "jpeg")
                {
                    imageFormat = ImageFormat.Jpeg;
                }
                else if (extension == "tiff")
                {
                    imageFormat = ImageFormat.Tiff;
                }

                // If the image format is not one that you expect, ignore it,
                // and do not return markup for the link.
                if (imageFormat == null)
                {
                    return null;
                }

                string imageFileName = imageDirectoryName + "/image" + imageCounter.ToString() + "." + extension;

                try
                {
                    imageInfo.Bitmap.Save(imageFileName, imageFormat);
                }
                catch (System.Runtime.InteropServices.ExternalException)
                {
                    // The image could not be saved, so emit no <img> element for it.
                    return null;
                }

                XElement img = new XElement(Xhtml.img,
                    new XAttribute(NoNamespace.src, imageFileName),
                    imageInfo.ImgStyleAttribute,
                    imageInfo.AltText != null ? new XAttribute(NoNamespace.alt, imageInfo.AltText) : null);
                return img;
            }
        };

        // Convert the document that was opened above (the original snippet passed an undefined doc1).
        XElement html = OpenXmlPowerTools.HtmlConverter.ConvertToHtml(doc, convSettings);
    }

The code above works fine and converts images as well, but if the document has a header and footer, those are not converted.

Is there any workaround to include the header and footer in the HTML file?

Please advise. Thanks!

OpenXmlPowerTools ignores headers and footers when converting a .docx document to HTML, so they won't show up in the resulting HTML (you can browse the source code on GitHub).

Perhaps that's because the concept of a 'page' doesn't apply to HTML, so there's no obvious equivalent to a document header.
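One possible workaround, offered only as a rough sketch and not as anything OpenXmlPowerTools provides, is to read the header and footer parts yourself with the Open XML SDK and splice their text into the converted HTML. The class `HeaderFooterHtml`, the method `AddHeaderAndFooterText`, and the CSS class names `doc-header` and `doc-footer` are made up for this illustration. The sketch assumes the `DocumentFormat.OpenXml` NuGet package is referenced and copies text only; images in the header or footer are not handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of OpenXmlPowerTools.
public static class HeaderFooterHtml
{
    // Builds <div class="cssClass"> containing one <p> per non-empty paragraph.
    // Only w:t text runs are read, so field instructions are skipped.
    private static XElement ToDiv(XNamespace ns, string cssClass, IEnumerable<Paragraph> paragraphs)
    {
        return new XElement(ns + "div",
            new XAttribute("class", cssClass),
            paragraphs
                .Select(p => string.Concat(p.Descendants<Text>().Select(t => t.Text)))
                .Where(text => !string.IsNullOrWhiteSpace(text))
                .Select(text => new XElement(ns + "p", text)));
    }

    // Inserts header text at the top of <body> and footer text at the bottom.
    public static void AddHeaderAndFooterText(WordprocessingDocument doc, XElement html)
    {
        MainDocumentPart main = doc.MainDocumentPart;
        // Locate <body> by local name so the sketch does not depend on the namespace used.
        XElement body = html.DescendantsAndSelf().FirstOrDefault(e => e.Name.LocalName == "body");
        if (main == null || body == null)
        {
            return;
        }

        XNamespace ns = body.Name.Namespace;

        // A document can contain several header/footer parts (first page, odd, even).
        // This sketch simply concatenates all of them.
        List<Paragraph> headerParagraphs = main.HeaderParts
            .Where(part => part.Header != null)
            .SelectMany(part => part.Header.Descendants<Paragraph>())
            .ToList();
        List<Paragraph> footerParagraphs = main.FooterParts
            .Where(part => part.Footer != null)
            .SelectMany(part => part.Footer.Descendants<Paragraph>())
            .ToList();

        if (headerParagraphs.Count > 0)
        {
            body.AddFirst(ToDiv(ns, "doc-header", headerParagraphs));
        }
        if (footerParagraphs.Count > 0)
        {
            body.Add(ToDiv(ns, "doc-footer", footerParagraphs));
        }
    }
}
```

It would be called right after the conversion, while the document is still open, for example `HeaderFooterHtml.AddHeaderAndFooterText(doc, html);`. Header and footer images live in the `ImageParts` of each `HeaderPart` and `FooterPart`, so they would need to be extracted separately, and for an email body they would also have to be embedded or hosted somewhere the recipient can reach.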


 