
How to convert .docx files to HTML in C# and save them to a specific directory?

I have .docx files in the format shown below. I want to convert each .docx to .html and save it in the path given below.

I have more than 200 .docx files, so converting them to .html manually is impractical.

Content of a .docx file:

<START>
<TITLE>UAE0d23376</TITLE>
<BODY>
<P>3376</P>
<P>
urged that he should be sent to saint winifreds, with some vague notion of making a man of him. he<br>might as well have thrown a piece of brussels lace into the fire with intention of changing it into<br>you want be troubled with this one long, said her son; ill go with me, and that's soon 
</P>
</BODY>
<END>

I need to convert it to .html and save it under "c:\\ConvertedToHTML".

Can you please help me solve this?

Convert .docx file to HTML format

Add a reference to OpenXmlPowerTools.dll. Code:

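using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging; // WordprocessingDocument lives here
using OpenXmlPowerTools;

// DocxFilePath and HTMLFilePath are the input and output file paths.
// Copy the file into an expandable MemoryStream so the document can be
// opened read/write without touching the original file on disk.
byte[] byteArray = File.ReadAllBytes(DocxFilePath);
using (MemoryStream memoryStream = new MemoryStream())
{
    memoryStream.Write(byteArray, 0, byteArray.Length);
    using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
    {
        HtmlConverterSettings settings = new HtmlConverterSettings()
        {
            PageTitle = "My Page Title"
        };
        XElement html = HtmlConverter.ConvertToHtml(doc, settings);

        // Write the generated XHTML to the target file.
        File.WriteAllText(HTMLFilePath, html.ToStringNewLineOnAttributes());
    }
}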
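
Since the question involves more than 200 files, the snippet above could be wrapped in a loop over a folder. The following is only a sketch under a few assumptions: the class name BatchConverter and its methods GetHtmlPath, ConvertFile and ConvertFolder are hypothetical names introduced here, the source folder is whatever directory holds the .docx files, and the page title is simply taken from the file name. The conversion itself uses the same OpenXmlPowerTools calls as above.

```csharp
using System;
using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class BatchConverter
{
    // Hypothetical helper: maps "name.docx" to "<outputDir>/name.html".
    public static string GetHtmlPath(string docxPath, string outputDir)
    {
        string name = Path.GetFileNameWithoutExtension(docxPath) + ".html";
        return Path.Combine(outputDir, name);
    }

    // Converts a single file, using the same calls as the snippet above.
    public static void ConvertFile(string docxPath, string htmlPath)
    {
        byte[] bytes = File.ReadAllBytes(docxPath);
        using (MemoryStream stream = new MemoryStream())
        {
            stream.Write(bytes, 0, bytes.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                HtmlConverterSettings settings = new HtmlConverterSettings()
                {
                    // Assumption: use the file name as the page title.
                    PageTitle = Path.GetFileNameWithoutExtension(docxPath)
                };
                XElement html = HtmlConverter.ConvertToHtml(doc, settings);
                File.WriteAllText(htmlPath, html.ToStringNewLineOnAttributes());
            }
        }
    }

    // Converts every .docx in sourceDir and writes the results to outputDir.
    public static void ConvertFolder(string sourceDir, string outputDir)
    {
        Directory.CreateDirectory(outputDir); // does nothing if it already exists
        foreach (string docxPath in Directory.EnumerateFiles(sourceDir, "*.docx"))
        {
            // Skip the "~$..." lock files Word creates while a document is open.
            if (Path.GetFileName(docxPath).StartsWith("~$")) continue;
            try
            {
                ConvertFile(docxPath, GetHtmlPath(docxPath, outputDir));
            }
            catch (Exception ex)
            {
                // Report the failure and continue with the remaining files.
                Console.Error.WriteLine(docxPath + ": " + ex.Message);
            }
        }
    }
}
```

It could then be called as BatchConverter.ConvertFolder(sourceFolder, @"c:\ConvertedToHTML"), where sourceFolder is the directory containing the .docx files. Catching exceptions per file is a design choice so that one damaged document does not stop the whole batch.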
