简体   繁体   中英

HTML to PDF adding a table of contents (TOC) dynamically

I am using iText 7 to convert html to PDF, but I couldn't find even one example on how to add a table of contents to the final PDF. Converting a text to PDF and adding a TOC is an easy task when following iText example here , but apparently it is not possible when converting HTML to PDF.

The requirement is simple, just find all the HTML tag attributes "data-toc" and use the value to create the TOC

Here is what I got so far using the same approach used in the Text to PDF sample here :

  • VS 2019

Program.cs

using System.IO;

namespace ConsoleApp
{
    class Program
    {
        private const string SRC = @"C:\iText\sxsw.html";
        private const string DEST = @"C:\iText\sxsw.pdf";

        static void Main(string[] args)
        {
            FileInfo file = new FileInfo(DEST);

            file.Directory.Create();

            new HtmlAndToc().CreatePdf(SRC, DEST);
        }
    }
}

HtmlAndTOC.cs

using iText.Html2pdf;
using iText.Html2pdf.Attach;
using iText.Html2pdf.Attach.Impl;
using iText.Html2pdf.Resolver.Font;
using iText.Kernel.Pdf;
using iText.StyledXmlParser.Node;
using System;
using System.IO;

namespace ConsoleApp
{
    class HtmlAndToc
    {
        public void CreatePdf(string src, string dest)
        {
            ConverterProperties properties = new ConverterProperties();
            properties.SetFontProvider(new DefaultFontProvider(true, true, true));

            // Custom Factory
            properties.SetTagWorkerFactory(new CustomTagWorkerFactory());

            PdfWriter writer = new PdfWriter(dest);
            PdfDocument pdfDoc = new PdfDocument(writer);
            pdfDoc.SetTagged();

            FileStream htmlStream = new FileStream(src, FileMode.Open);

            HtmlConverter.ConvertToPdf(htmlStream, pdfDoc, properties);

            htmlStream.Close();
            pdfDoc.Close();
        }
    }

    public class CustomTagWorkerFactory : DefaultTagWorkerFactory
    {
        public override ITagWorker GetCustomTagWorker(IElementNode tag, ProcessorContext context)
        {
            string dataTocAttrValue = tag.GetAttribute("data-toc");

            if (String.IsNullOrWhiteSpace(dataTocAttrValue))
            {
                return null;
            }

            // At this point I have the HTML element and the data-toc attribute value that is a TOC title to be associated to a page number
            // Because the data-toc attribute can be on any HTML element type I couldn't find a way to create an Outline/AddDestination nor 
            // a way to SetNextRenderer to keep the page number updated as the Text to PDF example does in
            // https://itextpdf.com/en/resources/examples/itext-7/toc-first-page

            Console.WriteLine(dataTocAttrValue);

            return null;
        }
    }
}

and here a sample HTML file:

 <!DOCTYPE html> <html> <head> <!-- example from: https://itextpdf.com/ --> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="stylesheet" type="text/css" href="css/sxsw.css"/> <link rel="stylesheet" media="print only" href="css/sxsw_print.css"> <!-- Based on the reponsive design example found at https://www.w3schools.com/css/css_rwd_intro.asp --> </head> <body> <div class="header"> <h1>SXSW</h1> </div> <div class="container"> <div class="one col-1-m col-1 menu"> <ul data-toc="Activities"> <li>Interactive</li> <li>Film</li> <li>Music</li> <li>Comedy</li> </ul> </div> <div class="two col-3-m col-3 "> <h1>South by Southwest</h1> <p>The South by Southwest&reg; (SXSW&reg;) Conference &amp; Festivals in Austin (TX) celebrate the convergence of the interactive, film, and music industries. Fostering creative and professional growth alike, SXSW&reg; is the premier destination for discovery.</p> </div> <div class="three col-4-m col-1"> <div data-toc="South by Southwest Conference" class="aside"> <h2>Conference</h2> <p>The SXSW conference is an amazing conglomeration of people, places, brands, music, tech and film.</p> <h2>Festivals</h2> <p>The different SXSW festivals take place for 10 days in mid-March Every year, with SXSW Interactive lasting for 5 days, Music for 6 days, and Film &amp; Comedy running concurrently for 9 days.</p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <h2>Exhibitions</h2> <p>Trade Show, Flatstock, SXSW Marketplace, Job Market, SXSW Create, Gaming Expo, Southbites Trailer Park, Spotlights.</p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> </div> </div> </div> <div class="container"> <h2 class="header">SXSW Conference and festivals</h2> <p data-toc="SXSW Conference and festivals"> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <h2 class="header">Exhibitions</h2> <div class="container"> <div class="col-2-m col-3"> <div><span data-toc="Exhibitions One" class=exhibition>SXSW Trade Show:</span> The all-encompassing exhibition for creative industries.</div> <div><span data-toc="Exhibitions Two" class=exhibition>Flatstock:</span> The world's top gig posters and artists.</div> <div><span data-toc="Exhibitions Three" class=exhibition>SXSW Marketplace:</span> One big pop-up shop at the center of SXSW.</div> <div><span class=exhibition>Job Market:</span> Where innovative talent and companies connect.</div> </div> <div class="col-2-m col-3"> <div><span data-toc="Exhibitions Four" class=exhibition>SXSW Create:</span> Hands on creative fun for imaginations of all ages.</div> <div><span data-toc="Exhibitions Six"class=exhibition>Gaming Expo:</span> The hub of gaming culture at SXSW.</div> <div><span class=exhibition>Southbites Trailer Park:</span> Featuring some of the best food trucks around.</div> <div><span class=exhibition>Spotlights:</span> Highlighting one-one-one connections with promising startups.</div> </div> </div> </div> </body> </html>

I hope this is a good question and thank you in advance for any help

My answer is basically the same as Alexey's, except I'll try to simplify it a bit.

Most of this will use the documentation for target-counter, which is located here: https://www.oxygenxml.com/doc/versions/23.1/ug-editor/topics/dg-target-counter-function.html

Follow this basic structure. In the body of your document, add an id to anything you'd like to link to. In your table of contents, add an href for each id:

<html>
  <body>

    <div class="toc-container">
      <div class="toc-row"><a href="#one">Link To Page One</a></div>
      <div class="toc-row"><a href="#two">Link To Page Two</a></div>
    </div>

    <div class="content-container">
      <div class="page" id="one">Page One Content Here</div>
      <div class="page" id="two">Page Two Content Here</div>
    </div>

  </body>
</html>

The css to implement the table of contents is incredibly simple then:

.toc-row a::after {
  content: leader('.') target-counter(attr(href), page, decimal);
}

That's it. The target-counter uses the href to grab the page number and puts that page number in your table of contents. You can style it however you like, but the basic styling ends up looking something like this:

目录示例

My answer is in Java but you can easily convert it to .NET because Jsoup version I will be using is embedded into iText, and the rest of the conversion is basically changing method names to start from an uppercase letter.

It's possible now to generate a table of contents with pure pdfHTML 3.0.3+ , but this will naturally require some preprocessing of your HTML file.

The idea is that we will loop over the elements that are marked with data-toc and create corresponding Table of Contents elements. The most interesting part is generating page numbers that indicate pages where the content we refer to is placed. We are going to use target-counter CSS function for that, and we are also going to mark all the elements with data-toc attributes with a unique ID to be able to refer to them both in the context of target-counter and also jump to those elements in pure links.

Here is an example code with some helper comments:

Document htmlDoc = Jsoup.parse(new File("path/to/in.html"), "UTF-8");

// This is our Table of Contents aggregating element
Element tocElement = htmlDoc.body().prependElement("div");
tocElement.append("<b>Table of contents</b>");

// We are going to build a complex CSS
StringBuilder tocStyles = new StringBuilder().append("<style>");

Elements tocElements = htmlDoc.select("[data-toc]");
for (Element elem : tocElements) {
    // Here we create an anchor to be able to refer to this element when generating page numbers and links
    String id = UUID.randomUUID().toString();
    elem.attr("id", id);

    // CSS selector to show page numebr for a TOC entry
    tocStyles.append("*[data-toc-id=\"" + id + "\"] .toc-page-ref::after { content: target-counter(#" + id + ", page) }");

    // Generating TOC entry as a small table to align page numbers on the right
    Element tocEntry = tocElement.appendElement("table");
    tocEntry.attr("style", "width: 100%");
    Element tocEntryRow = tocEntry.appendElement("tr");
    tocEntryRow.attr("data-toc-id", id);
    Element tocEntryTitle = tocEntryRow.appendElement("td");
    tocEntryTitle.appendText(elem.attr("data-toc"));
    Element tocEntryPageRef = tocEntryRow.appendElement("td");
    tocEntryPageRef.attr("style", "text-align: right");
    // <span> is a placeholder element where target page number will be inserted
    // It is wrapped by an <a> tag to create links pointing to the element in our document
    tocEntryPageRef.append("<a href=\"#" + id + "\"><span class=\"toc-page-ref\"></span></a>");
}


tocStyles.append("</style>");

htmlDoc.head().append(tocStyles.toString());

String html = htmlDoc.outerHtml();

HtmlConverter.convertToPdf(html, new FileOutputStream("path/to/out.pdf"));

Visual representation of the result I got using the above code and your example file:

结果

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM