
C# HTML to DOCX conversion using Microsoft.Office.Interop

I am converting HTML to DOCX using Microsoft.Office.Interop.Word. The HTML contains img tags. The converted .docx files show the images properly on the server, but on other machines the images do not appear.

After investigating, I found that the images in the .docx are not embedded: the document only stores the image path on the server.

Any help on this would be much appreciated.

The code is as follows:

// Requires: using System.IO;
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
word.Visible = false;
Object oMissing = System.Reflection.Missing.Value;

// Open the HTML file once, as a web page, and keep the returned document.
Object openType = Microsoft.Office.Interop.Word.WdOpenFormat.wdOpenFormatWebPages;
Object filepath = documentPath;
Microsoft.Office.Interop.Word.Document wordDoc =
    word.Documents.Open(FileName: filepath, ReadOnly: false, Format: openType);

// Save next to the source file, with the HTML extension stripped from the name.
string htmlFileNameWithExtension = Path.GetFileName(documentPath);
string htmlFileNameWithoutExtension = Path.GetFileNameWithoutExtension(documentPath);
Object saveto = documentPath.Replace(htmlFileNameWithExtension, htmlFileNameWithoutExtension);
Object fileFormat = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDocumentDefault;

wordDoc.SaveAs(ref saveto, ref fileFormat, ref oMissing, ref oMissing, ref oMissing,
               ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
               ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
               ref oMissing);

wordDoc.Close(ref oMissing, ref oMissing, ref oMissing);
word.Quit(ref oMissing, ref oMissing, ref oMissing);

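Below is a simplified example that loops through the document's inline shapes and sets each linked image to be saved with the document (see SavePictureWithDocument). Run it after opening the document and before calling SaveAs.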

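// Word collections are 1-based, so iterate from 1 through Count inclusive.
for (var i = 1; i <= wordDoc.InlineShapes.Count; i++) {
    // Shapes without link information are skipped.
    if (wordDoc.InlineShapes[i].LinkFormat == null) {
        continue;
    }

    wordDoc.InlineShapes[i].LinkFormat.SavePictureWithDocument = true;
}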

So beautiful and logical

for (var i = 1; i <= wordDoc.InlineShapes.Count; i++) {
    if (wordDoc.InlineShapes[i].LinkFormat != null) {
        wordDoc.InlineShapes[i].LinkFormat.SavePictureWithDocument = true;
    }
}

You get a null when you try to access index i = 0, because numbering starts from 1. The loop therefore has to start at 1 and use <= for the upper bound.

With Office 2016, for some reason wordDoc.InlineShapes[i] itself came back as null, so I extended the solution with an additional check:

for (var i = 1; i <= wordDoc.InlineShapes.Count; i++) {
    var shape = wordDoc.InlineShapes[i];
    // Skip missing shapes as well as shapes without link information.
    if (shape == null || shape.LinkFormat == null) {
        continue;
    }

    shape.LinkFormat.SavePictureWithDocument = true;
}

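Or the same with a lambda:

// Requires: using System.Linq;
// InlineShapes is a non-generic collection, so Cast<T>() is needed before LINQ can be used.
document.InlineShapes.Cast<Microsoft.Office.Interop.Word.InlineShape>().
                    Where(v => v != null && v.LinkFormat != null).ToList().
                    ForEach(v => v.LinkFormat.SavePictureWithDocument = true);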
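
For reference, here is a minimal sketch of how the conversion from the question and the SavePictureWithDocument loop from the answers might fit together in one place. The class name HtmlToDocxConverter, the method ConvertToDocx and the parameter htmlPath are made up for illustration and do not appear in the original post. The sketch assumes Word is installed on the machine, that htmlPath is an absolute path, and that the project targets C# 4 or later, so that optional COM arguments can be omitted. Whether setting SavePictureWithDocument is enough to embed every image may depend on the Word version, so treat it as a starting point rather than a definitive implementation.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

public static class HtmlToDocxConverter
{
    // Hypothetical helper (not from the original post): converts one HTML file
    // to .docx and asks Word to store linked pictures inside the document.
    public static string ConvertToDocx(string htmlPath)
    {
        // Assumption: htmlPath is absolute, so the output lands next to the source.
        string docxPath = Path.ChangeExtension(htmlPath, ".docx");

        Word.Application word = new Word.Application();
        Word.Document doc = null;
        try
        {
            word.Visible = false;

            // Open the HTML file once, telling Word to treat it as a web page.
            doc = word.Documents.Open(
                FileName: htmlPath,
                ReadOnly: false,
                Format: Word.WdOpenFormat.wdOpenFormatWebPages);

            // Word collections are 1-based: iterate from 1 through Count inclusive.
            for (int i = 1; i <= doc.InlineShapes.Count; i++)
            {
                Word.InlineShape shape = doc.InlineShapes[i];
                if (shape == null || shape.LinkFormat == null)
                {
                    continue; // no link information, nothing to change
                }

                // Store the picture in the document instead of only its path.
                shape.LinkFormat.SavePictureWithDocument = true;
            }

            doc.SaveAs(
                FileName: docxPath,
                FileFormat: Word.WdSaveFormat.wdFormatDocumentDefault);

            return docxPath;
        }
        finally
        {
            // Always shut Word down, even if the conversion throws,
            // so no hidden WINWORD.EXE process is left running on the server.
            if (doc != null)
            {
                doc.Close(SaveChanges: false);
            }
            word.Quit();
            Marshal.ReleaseComObject(word);
        }
    }
}
```

A call such as HtmlToDocxConverter.ConvertToDocx(documentPath) would then return the path of the saved .docx. The try/finally block is a design choice rather than something the answers require: it keeps a failed conversion from leaving a Word instance open.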

