
How can I add an interactive “table of contents” to a scanned pdf?

I'm trying to go from a paper document to a searchable pdf with a table of contents.

Sometimes you download a PDF book or document (for example, the Intel manual shown below) that is searchable and also has a table of contents. When you put the same document on Google Drive and open it with PDF Expert on an iPad, it is still searchable and still has its table of contents. This is what I'd like to have for all my scanned PDFs. [Screenshots of the Intel manual]

Now a more concrete example. Shown below is a document that I scanned with the Fujitsu ScanSnap. It is also searchable, thanks to the software that comes with the ScanSnap. So I now have a searchable PDF that can be opened locally or on my iPad, but it doesn't have a table of contents. My main question is: how can I add a table of contents like the one in the Intel manual to a scanned PDF? [Screenshots of the scanned document]

It seems like lots of people are doing different things under the name "table of contents". People who design documents, for instance, use InDesign. I think what I'm trying to do must be simpler than that. There has to be an easy way to do this with, say, Adobe Acrobat Pro, perhaps by adding "bookmarks", "links" or "tags" to the existing table of contents. Do you know of a clear and concise way to do this using Acrobat or some other software?

thanks for the help

I have done this before by combining multiple "booklets". Each "Chapter" was a series of pages combined in Adobe Acrobat Pro. I would combine chapters into separate "booklets" and then name them a chapter name, and then combine all chapters into a new booklet.

JPdfBookmarks can work for scanned books.


Step 1: Prepare the table of contents

Save the TOC in a .txt file in this format:

Chapter 1. The Beginning/23
    Para 1.1 Child of The Beginning/25,FitWidth,96
        Para 1.1.1 Child of Child of The Beginning/26,FitHeight,43
Chapter 2. The Continue/30,TopLeft,120,42
    Para 2.1 Child of The Beginning/32,FitPage
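If the outline already exists as structured data, a file in this shape can be generated rather than typed. The following is only a sketch: the `(title, page, children)` tuples and the `render_toc` helper are made up for illustration, and the four-space indent is copied from the example above, so check the manual for the indentation your version of the program expects.

```python
def render_toc(entries, depth=0, indent="    "):
    """Turn nested (title, page, children) tuples into 'Title/page' lines.

    Nesting depth becomes leading indentation, as in the sample file above.
    The indent string is an assumption taken from that sample.
    """
    lines = []
    for title, page, children in entries:
        lines.append(f"{indent * depth}{title}/{page}")
        lines.extend(render_toc(children, depth + 1, indent))
    return lines


# Hypothetical outline mirroring the sample TOC (view options left out).
toc = [
    ("Chapter 1. The Beginning", 23, [
        ("Para 1.1 Child of The Beginning", 25, [
            ("Para 1.1.1 Child of Child of The Beginning", 26, []),
        ]),
    ]),
    ("Chapter 2. The Continue", 30, [
        ("Para 2.1 Child of The Beginning", 32, []),
    ]),
]

toc_lines = render_toc(toc)
print("\n".join(toc_lines))
```

View options such as `,FitWidth,96` are omitted here; they could be appended to the page number in the same way if needed.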

You can OCR the TOC and use regex to fix it.
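As a rough illustration of the regex step, here is a minimal Python sketch. It assumes the OCR output has one entry per line, with the title followed by dot leaders or spaces and then the page number; real OCR output will vary, so the pattern and the `convert_ocr_toc` helper are assumptions rather than a general solution. Lines that don't match are returned separately so they can be fixed by hand.

```python
import re

# Title, then optional dot leaders / spaces, then a page number at the end.
OCR_LINE = re.compile(r"^(?P<title>.+?)[\s.·…_]*(?P<page>\d+)\s*$")


def convert_ocr_toc(raw_lines):
    """Convert OCR'd TOC lines to 'Title/page'; collect lines that don't fit."""
    converted, unmatched = [], []
    for raw in raw_lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        m = OCR_LINE.match(line)
        if m:
            converted.append(f"{m.group('title')}/{m.group('page')}")
        else:
            unmatched.append(line)
    return converted, unmatched


sample = [
    "Chapter 1. The Beginning ..... 23",
    "Para 1.1 Child of The Beginning . . . 25",
    "Preface",                           # no page number: needs manual review
    "Chapter 2. The Continue ..... 3O",  # OCR read the digit 0 as the letter O
]
converted, unmatched = convert_ocr_toc(sample)
```

OCR usually loses indentation, so the nesting levels still have to be restored afterwards, either by hand or with a rule based on the numbering in the titles.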

Step 2: Load that TOC

Step 3: Prepare for step 4

This sounds dumb, but if you miss it you will be frustrated and have to do it again. Expand all bookmarks (Ctrl + E), select all of them, then go to Tools → Apply Page Offset.

Step 4: Apply page offset

This step should be self-explanatory. Don't forget to save.
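The offset is presumably needed because the page numbers printed in a book's TOC rarely match the page positions in the PDF (covers and front matter shift everything). As an alternative to doing this in the GUI, the same shift could be applied to the text file before loading it. This is a sketch under that assumption; `shift_pages` is a hypothetical helper, and it assumes the `Title/page[,view options]` layout shown in Step 1.

```python
import re

# Everything up to the last '/', the page number, then optional view options.
ENTRY = re.compile(r"^(?P<head>.*)/(?P<page>\d+)(?P<rest>,.*)?$")


def shift_pages(lines, offset):
    """Add `offset` to the page number of every TOC line; keep the rest as is."""
    shifted = []
    for line in lines:
        m = ENTRY.match(line)
        if m is None:
            shifted.append(line)  # leave unrecognised lines untouched
            continue
        page = int(m.group("page")) + offset
        shifted.append(f"{m.group('head')}/{page}{m.group('rest') or ''}")
    return shifted


original = [
    "Chapter 1. The Beginning/23",
    "    Para 1.1 Child of The Beginning/25,FitWidth,96",
    "Chapter 2. The Continue/30,TopLeft,120,42",
]
# Example: printed page 1 sits on the 13th page of the PDF, so offset = 12.
shifted = shift_pages(original, 12)
```

Only the number directly after the last `/` is changed; the numbers inside the view options (such as `96` or `120,42`) are left alone.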


That's it. You are done. For more information, you can read its manual. The program also has a command-line mode and works on Linux and Mac.

If there are non-Roman characters, be sure to use the same encoding when dumping and applying bookmarks.
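If the TOC file is produced or edited by a script, the simplest way to follow this advice is to name the encoding explicitly every time the file is read or written instead of relying on the platform default. The sketch below uses UTF-8 as an assumption; whichever encoding you pick, the same one has to be selected in the program when dumping and applying bookmarks. The helper names and the file name are illustrative only.

```python
import os
import tempfile
from pathlib import Path

ENCODING = "utf-8"  # assumption: use the same value in the bookmark tool


def save_toc(path, lines, encoding=ENCODING):
    """Write TOC lines with an explicit encoding."""
    Path(path).write_text("\n".join(lines) + "\n", encoding=encoding)


def load_toc(path, encoding=ENCODING):
    """Read TOC lines back with the same explicit encoding."""
    return Path(path).read_text(encoding=encoding).splitlines()


# Round trip with non-Roman titles to check that nothing gets mangled.
titles = ["第1章 开始/23", "    Глава 1.1/25"]
with tempfile.TemporaryDirectory() as tmp:
    toc_path = os.path.join(tmp, "toc.txt")
    save_toc(toc_path, titles)
    restored = load_toc(toc_path)
```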

I also have a complete guide to processing scanned books that you may want to check out: The ultimate guide to process scanned books.


FYI:
How to OCR tables of contents to proper outputs?
How can I split in half a double-page scanned PDF in a single pass?

