
Html Text Content to Word using OpenXml

I have a rich text box that contains HTML-formatted text, and users can also paste copied images into it. I tried the AlternativeFormatImportPart and AltChunk approach. It generates the document, but I get the error below. Please let me know what I am missing here.

[image: enter image description here] [image: enter image description here]

  
MemoryStream ms; // = new MemoryStream(new UTF8Encoding(true).GetPreamble().Concat(Encoding.UTF8.GetBytes(h)).ToArray());
ms = new MemoryStream(HtmlToWord(fileContent));
//MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(h));
// Create alternative format import part.
AlternativeFormatImportPart chunk = mainDocPart.AddAlternativeFormatImportPart("application/xhtml+xml", altChunkId);
chunk.FeedData(ms);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;

public static byte[] HtmlToWord(String html)
{
    const string filename = "test.docx";
    if (File.Exists(filename)) File.Delete(filename);
    var doc = new Document();
    using (MemoryStream generatedDocument = new MemoryStream())
    {
        using (WordprocessingDocument package = WordprocessingDocument.Create(generatedDocument, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = package.MainDocumentPart;
            if (mainPart == null)
            {
                mainPart = package.AddMainDocumentPart();
                new Document(new Body()).Save(mainPart);
            }
            HtmlConverter converter = new HtmlConverter(mainPart);
            converter.ExcludeLinkAnchor = true;
            converter.RefreshStyles();
            converter.ImageProcessing = ImageProcessing.AutomaticDownload;
            //converter.BaseImageUrl = new Uri(domainNameURL + "Images/");
            converter.ConsiderDivAsParagraph = false;
            Body body = mainPart.Document.Body;
            var paragraphs = converter.Parse(html);
            for (int i = 0; i < paragraphs.Count; i++)
            {
                body.Append(paragraphs[i]);
            }
            mainPart.Document.Save();
        }
        return generatedDocument.ToArray();
    }
}

AlternativeFormatImportPart had some issues when fed from a MemoryStream: the document was not formatted well. So I followed an alternate approach: the HtmlToWord method saves the HTML content to a Word file, and I then read that file with a FileStream and feed it to the AlternativeFormatImportPart.

string docFileName;
HtmlToWord(fileContent, out docFileName);
// Create alternative format import part.
AlternativeFormatImportPart chunk = mainDocPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
// Dispose the stream once its data has been copied into the part.
using (FileStream fileStream = File.Open(docFileName, FileMode.Open))
{
    chunk.FeedData(fileStream);
}
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
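
The snippet above stops before the AltChunk element is placed in the document. A minimal end-to-end sketch of the same approach might look like the following. The class name AltChunkSketch, the method AppendDocxChunk, and its parameters are hypothetical and not part of the original post. It assumes the intermediate .docx already exists on disk and that altChunkId is a relationship id not yet used in the target document (for example "AltChunkId1"). Note that the part type matches the data being fed: the file is a Word document, so it is imported as WordprocessingML rather than "application/xhtml+xml".

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class AltChunkSketch
{
    // Hypothetical helper: embeds the .docx at sourceDocxPath into the
    // document at targetPath as an altChunk.
    public static void AppendDocxChunk(string targetPath, string sourceDocxPath, string altChunkId)
    {
        using (WordprocessingDocument target = WordprocessingDocument.Open(targetPath, true))
        {
            MainDocumentPart mainDocPart = target.MainDocumentPart;

            // The declared part type must describe the bytes that are fed in.
            AlternativeFormatImportPart chunk = mainDocPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.WordprocessingML, altChunkId);

            using (FileStream fileStream = File.OpenRead(sourceDocxPath))
            {
                chunk.FeedData(fileStream);
            }

            // The AltChunk element only references the part by id; it still
            // has to be placed in the body for Word to import the content.
            AltChunk altChunk = new AltChunk { Id = altChunkId };
            Body body = mainDocPart.Document.Body;

            // Section properties, when present, must remain the last child of the body.
            if (body.LastChild is SectionProperties)
            {
                body.InsertBefore(altChunk, body.LastChild);
            }
            else
            {
                body.Append(altChunk);
            }

            mainDocPart.Document.Save();
        }
    }
}
```

A call such as AltChunkSketch.AppendDocxChunk("report.docx", docFileName, "AltChunkId1") would then embed the converted file; Word merges the chunk into the main document the next time the file is opened and saved.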
