繁体   English   中英

HTML 到 PDF 动态添加目录 (TOC)

[英]HTML to PDF adding a table of contents (TOC) dynamically

我正在使用 iText 7 将 html 转换为 PDF,但我什至找不到关于如何将目录添加到最终 PDF 的示例。 在此处遵循 iText 示例时,将文本转换为 PDF 并添加 TOC 是一项简单的任务,但显然将 HTML 转换为 PDF 时是不可能的。

要求很简单,只要找到所有的 HTML 标签属性“data-toc”并使用该值创建 TOC

这是我到目前为止使用的文本到 PDF 示例中使用的相同方法得到的结果

  • VS 2019

程序.cs

using System.IO;

namespace ConsoleApp
{
    class Program
    {
        private const string SRC = @"C:\iText\sxsw.html";
        private const string DEST = @"C:\iText\sxsw.pdf";

        static void Main(string[] args)
        {
            FileInfo file = new FileInfo(DEST);

            file.Directory.Create();

            new HtmlAndToc().CreatePdf(SRC, DEST);
        }
    }
}

HtmlAndTOC.cs

using iText.Html2pdf;
using iText.Html2pdf.Attach;
using iText.Html2pdf.Attach.Impl;
using iText.Html2pdf.Resolver.Font;
using iText.Kernel.Pdf;
using iText.StyledXmlParser.Node;
using System;
using System.IO;

namespace ConsoleApp
{
    class HtmlAndToc
    {
        public void CreatePdf(string src, string dest)
        {
            ConverterProperties properties = new ConverterProperties();
            properties.SetFontProvider(new DefaultFontProvider(true, true, true));

            // Custom Factory
            properties.SetTagWorkerFactory(new CustomTagWorkerFactory());

            PdfWriter writer = new PdfWriter(dest);
            PdfDocument pdfDoc = new PdfDocument(writer);
            pdfDoc.SetTagged();

            FileStream htmlStream = new FileStream(src, FileMode.Open);

            HtmlConverter.ConvertToPdf(htmlStream, pdfDoc, properties);

            htmlStream.Close();
            pdfDoc.Close();
        }
    }

    public class CustomTagWorkerFactory : DefaultTagWorkerFactory
    {
        public override ITagWorker GetCustomTagWorker(IElementNode tag, ProcessorContext context)
        {
            string dataTocAttrValue = tag.GetAttribute("data-toc");

            if (String.IsNullOrWhiteSpace(dataTocAttrValue))
            {
                return null;
            }

            // At this point I have the HTML element and the data-toc attribute value that is a TOC title to be associated to a page number
            // Because the data-toc attribute can be on any HTML element type I couldn't find a way to create an Outline/AddDestination nor 
            // a way to SetNextRenderer to keep the page number updated as the Text to PDF example does in
            // https://itextpdf.com/en/resources/examples/itext-7/toc-first-page

            Console.WriteLine(dataTocAttrValue);

            return null;
        }
    }
}

这里有一个示例 HTML 文件:

 <!DOCTYPE html> <html> <head> <!-- example from: https://itextpdf.com/ --> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="stylesheet" type="text/css" href="css/sxsw.css"/> <link rel="stylesheet" media="print only" href="css/sxsw_print.css"> <!-- Based on the reponsive design example found at https://www.w3schools.com/css/css_rwd_intro.asp --> </head> <body> <div class="header"> <h1>SXSW</h1> </div> <div class="container"> <div class="one col-1-m col-1 menu"> <ul data-toc="Activities"> <li>Interactive</li> <li>Film</li> <li>Music</li> <li>Comedy</li> </ul> </div> <div class="two col-3-m col-3 "> <h1>South by Southwest</h1> <p>The South by Southwest&reg; (SXSW&reg;) Conference &amp; Festivals in Austin (TX) celebrate the convergence of the interactive, film, and music industries. Fostering creative and professional growth alike, SXSW&reg; is the premier destination for discovery.</p> </div> <div class="three col-4-m col-1"> <div data-toc="South by Southwest Conference" class="aside"> <h2>Conference</h2> <p>The SXSW conference is an amazing conglomeration of people, places, brands, music, tech and film.</p> <h2>Festivals</h2> <p>The different SXSW festivals take place for 10 days in mid-March Every year, with SXSW Interactive lasting for 5 days, Music for 6 days, and Film &amp; Comedy running concurrently for 9 days.</p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <h2>Exhibitions</h2> <p>Trade Show, Flatstock, SXSW Marketplace, Job Market, SXSW Create, Gaming Expo, Southbites Trailer Park, Spotlights.</p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> </div> </div> </div> <div class="container"> <h2 class="header">SXSW Conference and festivals</h2> <p data-toc="SXSW Conference and festivals"> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <p> The SXSW Conference features 25 different tracks that prove the most unexpected discoveries happen when diverse topics and people come together. In 2017, SXSW equipped 421,900 people from 95 countries. There were 2,887 sessions by 5,280 speakers for 70,969 attendees. In addition to its conference, SXSW is known for its four festivals: interactive, film, music, and comedy. </p> <p> The evening networking events that make up the SXSW Interactive Festival range from parties, award presentations, private events, and more. In 2017, there were 1,084 official parties and events. </p> <p> The SXSW Film Festival has become known for the high caliber and diversity of films presented, and for its smart, enthusiastic audiences. Running the length of SXSWeek, Film festival attendees can connect with tech and music industry experts for an unparalleled experience at the forefront of discovery, creativity, and innovation. SXSW 2017 brought 171 Shorts, 130 Features, of which 83 World Premieres for 70,547 Attendees. </p> <p> The SXSW Music Festival is the most influential music industry event in the world. Record labels, booking agencies, management and PR firms, export offices, publishers, media outlets, tastemakers, and lifestyle brands converge to see thousands of artists perform in venues all throughout downtown Austin. In 2017, there were 167,800 attendees for 2,085 Performing Acts, of which 544 International acts from 63 different countries. </p> <p> Growing from a singular night's celebration of comedy's biggest names into a week-long whirlwind of a festival, the SXSW Comedy Festival presents uniquely diverse programming that highlights exceptional emerging and established talent. </p> <h2 class="header">Exhibitions</h2> <div class="container"> <div class="col-2-m col-3"> <div><span data-toc="Exhibitions One" class=exhibition>SXSW Trade Show:</span> The all-encompassing exhibition for creative industries.</div> <div><span data-toc="Exhibitions Two" class=exhibition>Flatstock:</span> The world's top gig posters and artists.</div> <div><span data-toc="Exhibitions Three" class=exhibition>SXSW Marketplace:</span> One big pop-up shop at the center of SXSW.</div> <div><span class=exhibition>Job Market:</span> Where innovative talent and companies connect.</div> </div> <div class="col-2-m col-3"> <div><span data-toc="Exhibitions Four" class=exhibition>SXSW Create:</span> Hands on creative fun for imaginations of all ages.</div> <div><span data-toc="Exhibitions Six"class=exhibition>Gaming Expo:</span> The hub of gaming culture at SXSW.</div> <div><span class=exhibition>Southbites Trailer Park:</span> Featuring some of the best food trucks around.</div> <div><span class=exhibition>Spotlights:</span> Highlighting one-one-one connections with promising startups.</div> </div> </div> </div> </body> </html>

我希望这是一个好问题,并提前感谢您的帮助

我的答案与 Alexey 的答案基本相同,只是我会尝试简化一下。

其中大部分将使用目标计数器的文档,该文档位于此处: https : //www.oxygenxml.com/doc/versions/23.1/ug-editor/topics/dg-target-counter-function.html

遵循这个基本结构。 在您的文档正文中,为您想要链接的任何内容添加一个 id。 在您的目录中,为每个 id 添加一个 href:

<html>
  <body>

    <div class="toc-container">
      <div class="toc-row"><a href="#one">Link To Page One</a></div>
      <div class="toc-row"><a href="#two">Link To Page Two</a></div>
    </div>

    <div class="content-container">
      <div class="page" id="one">Page One Content Here</div>
      <div class="page" id="two">Page Two Content Here</div>
    </div>

  </body>
</html>

实现目录的 css 非常简单:

.toc-row a::after {
  content: leader('.') target-counter(attr(href), page, decimal);
}

就是这样。 目标计数器使用 href 获取页码并将该页码放入您的目录中。 您可以随意设置样式,但基本样式最终看起来像这样:

目录示例

我的答案是在 Java 中,但您可以轻松地将其转换为 .NET,因为我将使用的Jsoup版本已嵌入到 iText 中,其余的转换基本上是将方法名称更改为从大写字母开始。

现在可以使用纯pdfHTML 3.0.3+生成目录,但这自然需要对 HTML 文件进行一些预处理。

这个想法是我们将循环使用data-toc标记的元素并创建相应的目录元素。 最有趣的部分是生成页码,指示我们引用的内容所在的页面。 为此,我们将使用target-counter CSS 函数,并且我们还将使用具有唯一 ID 的data-toc属性标记所有元素,以便能够在target-counter的上下文中引用它们,也可以跳转到纯链接中的那些元素。

这是一个带有一些帮助注释的示例代码:

Document htmlDoc = Jsoup.parse(new File("path/to/in.html"), "UTF-8");

// This is our Table of Contents aggregating element
Element tocElement = htmlDoc.body().prependElement("div");
tocElement.append("<b>Table of contents</b>");

// We are going to build a complex CSS
StringBuilder tocStyles = new StringBuilder().append("<style>");

Elements tocElements = htmlDoc.select("[data-toc]");
for (Element elem : tocElements) {
    // Here we create an anchor to be able to refer to this element when generating page numbers and links
    String id = UUID.randomUUID().toString();
    elem.attr("id", id);

    // CSS selector to show page numebr for a TOC entry
    tocStyles.append("*[data-toc-id=\"" + id + "\"] .toc-page-ref::after { content: target-counter(#" + id + ", page) }");

    // Generating TOC entry as a small table to align page numbers on the right
    Element tocEntry = tocElement.appendElement("table");
    tocEntry.attr("style", "width: 100%");
    Element tocEntryRow = tocEntry.appendElement("tr");
    tocEntryRow.attr("data-toc-id", id);
    Element tocEntryTitle = tocEntryRow.appendElement("td");
    tocEntryTitle.appendText(elem.attr("data-toc"));
    Element tocEntryPageRef = tocEntryRow.appendElement("td");
    tocEntryPageRef.attr("style", "text-align: right");
    // <span> is a placeholder element where target page number will be inserted
    // It is wrapped by an <a> tag to create links pointing to the element in our document
    tocEntryPageRef.append("<a href=\"#" + id + "\"><span class=\"toc-page-ref\"></span></a>");
}


tocStyles.append("</style>");

htmlDoc.head().append(tocStyles.toString());

String html = htmlDoc.outerHtml();

HtmlConverter.convertToPdf(html, new FileOutputStream("path/to/out.pdf"));

我使用上面的代码和您的示例文件得到的结果的可视化表示:

结果

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM