Convert HTML to PDF using iTextSharp

When I convert HTML to PDF using iTextSharp, the CSS styles I apply to the web page are not applied in the generated PDF.

Here is my CSS:

<style type="text/css">
    .cssformat
    {
        width: 300px;
        height: 200px;
        border: 2px solid black;
        background-color: white;
        border-top-left-radius: 60px 90px;
        border-bottom-right-radius: 60px 90px;
    }
</style>

Here is my HTML:

<div id="divpdf" runat="server">
    <table id="tid" runat="server">
        <tr>
            <td>
                <asp:Label ID="Label1" runat="server" Text="this is new way of pdf" CssClass="cssformat"></asp:Label>
            </td>
        </tr>
    </table>
</div>

This is what I have tried in C#:

        // Requires: using System.IO; using System.Web; using System.Web.UI;
        //           using iTextSharp.text; using iTextSharp.text.pdf;
        //           using iTextSharp.text.html.simpleparser;
        Response.ContentType = "application/pdf";
        Response.AddHeader("content-disposition", "attachment;filename=TestPage.pdf");
        Response.Cache.SetCacheability(HttpCacheability.NoCache);
        StringWriter sw = new StringWriter();
        HtmlTextWriter hw = new HtmlTextWriter(sw);
        Document pdfDoc = new Document(PageSize.A4, 60f, 80f, -2f, 35f);
        // Render the div (and the label inside it) to an HTML string.
        divpdf.RenderControl(hw);
        StringReader sr = new StringReader(sw.ToString());
        HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
        PdfWriter writer = PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
        pdfDoc.Open();
        // Parse the rendered HTML into the PDF document.
        htmlparser.Parse(sr);
        pdfDoc.Close();
        // Release the readers/writers before ending the response.
        sr.Close();
        hw.Close();
        sw.Close();
        Response.End();

I struggled quite a bit to convert from HTML to PDF using iTextSharp and eventually gave up because I could not get a converted PDF that looked 100% the same as my HTML5/CSS3 page. So I'm giving you the alternative that eventually worked for me.

There are surprisingly few options available when you are not prepared to pay for a commercial library. One of my clients had the same requirement (converting HTML to PDF) and did not want to pay for any third-party tools, so I had to make a plan. This is what I did. It is not the best solution, but it got the job done.

I downloaded the newest version of wkhtmltopdf. Unfortunately, the wkhtmltopdf tool did not render some of the Google graphs embedded in my HTML when converting to PDF. So I used the wkhtmltoimage tool, which is included in the same package, to convert the page to a PNG instead. That worked as expected and displayed all the graphs. I then downloaded the newest version of ImageMagick and converted the PNG to a PDF. I automated this process using C#.

Unfortunately this is not the most elegant solution because you have to perform two conversions and do a bit of work to automate everything, but this is the best solution I could come up with that gave me the desired results and quality.
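
The two-step conversion described above could be automated roughly as follows. This is only a minimal sketch, not the code I actually used: the class and method names are hypothetical, and it assumes that `wkhtmltoimage` and ImageMagick's `magick` command are both on the PATH (ImageMagick 6 installs `convert` instead of `magick`, so adjust the constant if needed).

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class HtmlToPdfViaImage
{
    // Assumption: both executables are on PATH; replace with full paths if not.
    private const string WkHtmlToImageExe = "wkhtmltoimage";
    private const string ImageMagickExe = "magick";

    // Wraps an argument in double quotes so paths containing spaces survive.
    public static string Quote(string arg)
    {
        return "\"" + arg + "\"";
    }

    // Runs an external tool, waits for it, and throws on a non-zero exit code.
    private static void Run(string exe, string arguments)
    {
        var startInfo = new ProcessStartInfo(exe, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    exe + " exited with code " + process.ExitCode);
            }
        }
    }

    // Step 1: HTML (file path or URL) -> PNG. Step 2: PNG -> PDF.
    public static void ConvertHtmlToPdf(string htmlPathOrUrl, string pdfPath)
    {
        string pngPath = Path.ChangeExtension(pdfPath, ".png");
        try
        {
            Run(WkHtmlToImageExe,
                "--format png " + Quote(htmlPathOrUrl) + " " + Quote(pngPath));
            Run(ImageMagickExe, Quote(pngPath) + " " + Quote(pdfPath));
        }
        finally
        {
            // Remove the intermediate image whether or not the conversion succeeded.
            if (File.Exists(pngPath))
            {
                File.Delete(pngPath);
            }
        }
    }
}
```

A call such as `HtmlToPdfViaImage.ConvertHtmlToPdf("report.html", "report.pdf")` would then produce the PDF, with `report.html` standing in for whatever page you need to convert. Because the PDF is built from a bitmap, its text is not selectable or searchable, which is part of the trade-off of this approach.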

Of course, there is plenty of commercial software out there that will do a faster and better job.

Just a side note:

The web page that I had to convert was developed in HTML5 and CSS3 using version 3 of Bootstrap, and it contained some Google graphs and charts. Everything was converted without any problems.

Below is an example that converts HTML content to PDF with iTextSharp's XMLWorker and applies the CSS passed in alongside it.

using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.html.table;   // TableData
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;

public static class PdfCreator {

    public static string ConvertHtmlToPdf(string htmlContent, string fileNameWithoutExtension, string filePath, string cssContent = "") {
        // make sure the output directory exists
        if (!Directory.Exists(filePath)) {
            Directory.CreateDirectory(filePath);
        }

        var fileNameWithPath = Path.Combine(filePath, fileNameWithoutExtension + ".pdf");

        using (var stream = new FileStream(fileNameWithPath, FileMode.Create)) {
            using (var document = new Document()) {
                var writer = PdfWriter.GetInstance(document, stream);
                document.Open();

                // instantiate custom tag processor and add to `HtmlPipelineContext`.
                var tagProcessorFactory = Tags.GetHtmlTagProcessorFactory();
                tagProcessorFactory.AddProcessor(new TableData(), new string[] {
                    HTML.Tag.TD
                });
                var htmlPipelineContext = new HtmlPipelineContext(null);
                htmlPipelineContext.SetTagFactory(tagProcessorFactory);

                var pdfWriterPipeline = new PdfWriterPipeline(document, writer);
                var htmlPipeline = new HtmlPipeline(htmlPipelineContext, pdfWriterPipeline);

                // get an ICssResolver and add the custom CSS
                var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
                cssResolver.AddCss(cssContent, "utf-8", true);
                var cssResolverPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

                // XMLWorker parses XML, so htmlContent must be well-formed XHTML
                var worker = new XMLWorker(cssResolverPipeline, true);
                var parser = new XMLParser(worker);
                using (var stringReader = new StringReader(htmlContent)) {
                    parser.Parse(stringReader);
                }
            }
        }
        return fileNameWithPath;
    }
}

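An <asp:Label> is rendered as a <span>, which is an inline element, so the width and height set in .cssformat have no effect on it. Change its display to block, for example by adding display:block to the .cssformat class.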

