How to convert a docx file to an HTML file using Open XML while keeping formatting
I know there are a lot of questions with the same title, but I have run into problems with each of them and have not found the right approach.
I am using the Open XML SDK 2.5 along with PowerTools to convert a .docx file to an .html file, using the HtmlConverter class for the conversion.
I can successfully convert the docx file into an HTML file, but the HTML file does not retain the original formatting of the document. For example, font size, color, underline, bold, etc. are not reflected in the HTML file.
Here is my existing code:
    using System.IO;
    using System.Xml.Linq;
    using DocumentFormat.OpenXml.Packaging;
    using OpenXmlPowerTools;

    public void ConvertDocxToHtml(string fileName)
    {
        // Copy the file into a memory stream so the original on disk is never modified.
        byte[] byteArray = File.ReadAllBytes(fileName);
        using (MemoryStream memoryStream = new MemoryStream())
        {
            memoryStream.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
            {
                HtmlConverterSettings settings = new HtmlConverterSettings()
                {
                    PageTitle = "My Page Title"
                };
                // Convert the document and write the resulting XHTML to disk.
                XElement html = HtmlConverter.ConvertToHtml(doc, settings);
                File.WriteAllText(@"E:\Test.html", html.ToStringNewLineOnAttributes());
            }
        }
    }
So I just want to know whether there is any way to retain the formatting in the converted HTML file.
I know of some third-party APIs that do this, but I would prefer a way to do it with Open XML or another open-source library.
PowerTools for Open XML just released a new HtmlConverter module. It now contains a free, open-source implementation of DOCX-to-HTML conversion, with the output formatted using CSS. The module, HtmlConverter.cs, supports all paragraph, character, and table styles, fonts and text formatting, numbered and bulleted lists, images, and more. See http://bit.ly/1bclyg9
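As a rough sketch of how the newer module might be called: the settings below (FabricateCssClasses, CssClassPrefix, AdditionalCss, and the two RestrictTo... flags) are written from memory of the sample code shipped with the newer PowerTools release, so treat them as assumptions and check them against the version you have installed. The class name DocxToHtmlSketch and the parameter names are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class DocxToHtmlSketch // hypothetical helper name
{
    public static void Convert(string docxPath, string htmlPath)
    {
        // Work on an in-memory copy; the converter may modify the document it opens.
        byte[] bytes = File.ReadAllBytes(docxPath);
        using (var stream = new MemoryStream())
        {
            stream.Write(bytes, 0, bytes.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                // Assumed property names from the newer HtmlConverterSettings;
                // verify against your installed PowerTools version.
                var settings = new HtmlConverterSettings
                {
                    PageTitle = "My Page Title",
                    FabricateCssClasses = true,   // emit CSS classes for the formatting
                    CssClassPrefix = "pt-",
                    AdditionalCss = "body { margin: 1cm auto; max-width: 20cm; padding: 0; }",
                    RestrictToSupportedLanguages = false,
                    RestrictToSupportedNumberingFormats = false
                };

                XElement html = HtmlConverter.ConvertToHtml(doc, settings);

                // Prepend an HTML doctype and write the page as UTF-8.
                var page = new XDocument(new XDocumentType("html", null, null, null), html);
                File.WriteAllText(htmlPath, page.ToString(SaveOptions.DisableFormatting), Encoding.UTF8);
            }
        }
    }
}
```

This sketch sets no image handler, so any images in the document would most likely be left out of the output; the newer settings class is understood to accept an ImageHandler callback for that purpose.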
You may want to find an external tool to help you with this, such as Aspose Words.
Your end result will not look exactly like your Word document, but this link may help.
You can use the OpenXML Viewer extension for Firefox to convert with formatting (http://openxmlviewer.codeplex.com). This works for me. Hope this helps.