iTextSharp PDF header from an HTML string (C#)

Question

I'm trying to generate PDF reports with customer information, a header, a footer, and so on using iTextSharp. All these reports are currently generated using the EVO APIs. As part of a migration, we plan to generate them with the iTextSharp APIs instead.

Is it possible to hand a ready-to-render HTML string to iTextSharp for the PDF header (the existing EVO design accepts an HTML string and builds the PDF from it), instead of using page events to lay the header out with PdfPTable and PdfPCell? The number of reports is huge, and we want to avoid rework.

Answer 1

"I need to know if there is any possibility to provide a ready to render HTML string to iTextSharp PDF header (Existing EVO design accepts HTML string and build PDF), instead of using PageEvents to design with PDFPTable and PDFPCell"

You will have to use page events to draw headers or footers, but there is no need to use PdfPTable explicitly there. You can actually render HTML during a page event, e.g. like this:

[Test]
public void CreatePdfWithHtmlHeader()
{
    string htmlHeader = "<!DOCTYPE html><html><body><table style=\"width: 100%; border: 1px solid black;\"><tr><td>A</td><td>B</td></tr></table></body></html>";

    using (FileStream output = new FileStream(@"C:\Temp\test-results\content\html-header.pdf", FileMode.Create, FileAccess.Write))
    using (Document document = new Document(PageSize.A4))
    {
        PdfWriter writer = PdfWriter.GetInstance(document, output);
        writer.PageEvent = new HtmlPageEventHelper(htmlHeader);
        document.Open();
        document.Add(new Paragraph("1"));
        document.NewPage();
        document.Add(new Paragraph("2"));
    }
}

This makes use of the following two small helper classes.

HtmlPageEventHelper is a page event listener that draws a given HTML snippet into the page header. It can alternatively or additionally write into the page footer; simply use appropriate column coordinates.

public class HtmlPageEventHelper : PdfPageEventHelper
{
    public HtmlPageEventHelper(string html)
    {
        this.html = html;
    }

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        base.OnEndPage(writer, document);

        ColumnText ct = new ColumnText(writer.DirectContent);
        XMLWorkerHelper.GetInstance().ParseXHtml(new ColumnTextElementHandler(ct), new StringReader(html));
        ct.SetSimpleColumn(document.Left, document.Top, document.Right, document.GetTop(-20), 10, Element.ALIGN_MIDDLE);
        ct.Go();
    }

    string html = null;
}

For more complex HTML snippets you may want to replace the XMLWorkerHelper.GetInstance().ParseXHtml call with a customized parser call as presented in @Skary's answer.

ColumnTextElementHandler is an IElementHandler implementation that adds content (generated, e.g., by parsing HTML) to a ColumnText.

public class ColumnTextElementHandler : IElementHandler
{
    public ColumnTextElementHandler(ColumnText ct)
    {
        this.ct = ct;
    }

    ColumnText ct = null;

    public void Add(IWritable w)
    {
        if (w is WritableElement)
        {
            foreach (IElement e in ((WritableElement)w).Elements())
            {
                ct.AddElement(e);
            }
        }
    }
}
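The remark about footers can be sketched as follows. This is only an illustration under assumptions: the class name HtmlFooterEventHelper is hypothetical, it reuses the ColumnTextElementHandler shown above, and the 20-point band below the bottom margin simply mirrors the header coordinates; adjust it to the height of your footer HTML.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical footer counterpart of HtmlPageEventHelper (not part of iTextSharp).
public class HtmlFooterEventHelper : PdfPageEventHelper
{
    private readonly string html;

    public HtmlFooterEventHelper(string html)
    {
        this.html = html;
    }

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        base.OnEndPage(writer, document);

        ColumnText ct = new ColumnText(writer.DirectContent);
        XMLWorkerHelper.GetInstance().ParseXHtml(new ColumnTextElementHandler(ct), new StringReader(html));

        // Assumed layout: a band reaching from 20 points below the bottom
        // margin (document.GetBottom(-20)) up to the bottom margin itself.
        ct.SetSimpleColumn(document.Left, document.GetBottom(-20), document.Right, document.Bottom, 10, Element.ALIGN_MIDDLE);
        ct.Go();
    }
}
```

A PdfWriter holds a single PageEvent, so if you need both a header and a footer it is probably simplest to draw both columns inside one OnEndPage implementation.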

By the way, the test above produces a PDF with this content:

...

Disclaimer: I predominantly work with Java and have not used the XmlWorker before. Thus, this code may have considerable potential for improvement.

Answer 2

I am not sure I have understood your question correctly.

If you are asking how to parse HTML to PDF using iTextSharp, here is the solution I found some time ago:

        using (Document document = new Document(size))
        {
            var writer = PdfWriter.GetInstance(document, stream);

            document.Open();
            document.NewPage();
            document.Add(new Chunk(""));

            var tagProcessors = (DefaultTagProcessorFactory)Tags.GetHtmlTagProcessorFactory();
            tagProcessors.RemoveProcessor(HTML.Tag.IMG);
            tagProcessors.AddProcessor(HTML.Tag.IMG, new CustomImageTagProcessor());

            var charset = Encoding.UTF8;

            CssFilesImpl cssFiles = new CssFilesImpl();
            cssFiles.Add(XMLWorkerHelper.GetInstance().GetDefaultCSS());
            var cssResolver = new StyleAttrCSSResolver(cssFiles);
            cssResolver.AddCss(srcCssData, "utf-8", true);

            var hpc = new HtmlPipelineContext(new CssAppliersImpl(new XMLWorkerFontProvider()));
            hpc.SetAcceptUnknown(true).AutoBookmark(true).SetTagFactory(tagProcessors);
            var htmlPipeline = new HtmlPipeline(hpc, new PdfWriterPipeline(document, writer));
            var pipeline = new CssResolverPipeline(cssResolver, htmlPipeline);
            var worker = new XMLWorker(pipeline, true);
            var xmlParser = new XMLParser(true, worker, charset);

            xmlParser.Parse(new StringReader(srcFileData));

            document.Close();
        }

To get it to work, you need to add a custom image tag processor that handles inline images in the HTML you pass to the conversion code above:

public class CustomImageTagProcessor : iTextSharp.tool.xml.html.Image
{
    public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent)
    {
        IDictionary<string, string> attributes = tag.Attributes;
        string src;
        if (!attributes.TryGetValue(HTML.Attribute.SRC, out src))
            return new List<IElement>(1);

        if (string.IsNullOrEmpty(src))
            return new List<IElement>(1);

        if (src.StartsWith("data:image/", StringComparison.InvariantCultureIgnoreCase))
        {
            // data:[<MIME-type>][;charset=<encoding>][;base64],<data>
            var base64Data = src.Substring(src.IndexOf(",") + 1);
            var imagedata = Convert.FromBase64String(base64Data);
            var image = iTextSharp.text.Image.GetInstance(imagedata);

            var list = new List<IElement>();
            var htmlPipelineContext = GetHtmlPipelineContext(ctx);
            list.Add(GetCssAppliers().Apply(new Chunk((iTextSharp.text.Image)GetCssAppliers().Apply(image, tag, htmlPipelineContext), 0, 0, true), tag, htmlPipelineContext));
            return list;
        }
        else
        {
            return base.End(ctx, tag, currentContent);
        }
    }
}
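The processor above takes everything after the first comma and assumes it is base64; a data URI without the ";base64" marker, or one with a damaged payload, would make Convert.FromBase64String throw. As a minimal sketch of a more defensive check (the class and method names are hypothetical and use only the .NET base library), the decoding step could be isolated like this:

```csharp
using System;

// Hypothetical helper, not part of iTextSharp.
public static class DataUriSketch
{
    // Returns the decoded bytes of a base64 data URI, or null if src is
    // not a data URI, is not base64-encoded, or has an invalid payload.
    public static byte[] TryDecodeBase64DataUri(string src)
    {
        if (string.IsNullOrEmpty(src) ||
            !src.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
            return null;

        int comma = src.IndexOf(',');
        if (comma < 0)
            return null;

        // The part before the comma looks like "data:image/png;base64".
        string header = src.Substring(0, comma);
        if (!header.EndsWith(";base64", StringComparison.OrdinalIgnoreCase))
            return null;

        try
        {
            return Convert.FromBase64String(src.Substring(comma + 1));
        }
        catch (FormatException)
        {
            return null;
        }
    }
}
```

In CustomImageTagProcessor, a null result could then fall through to base.End(ctx, tag, currentContent) instead of failing the whole conversion.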

Answer 1: 3, accepted, 2015-12-22 11:32:11

Answer 2: 2, 2015-12-22 07:37:45
