
Convert PDF to HTML version 3.2 with images and html file in a folder

I hope you are doing well.

I am looking for a PHP library that converts a PDF file containing images into an HTML file and meets the following requirements:

  1. The HTML file needs to be compatible with HTML version 3.2.
  2. The images in the PDF file need to be saved with a .jpg extension.
  3. The correct fonts from the PDF need to be used in the HTML file.
  4. The images and the HTML file need to end up together in one result folder.

I have tried most of the PHP libraries I could find, but none of them does what I need.

Please let me know about a library that meets all four requirements above (image attached for reference).

[image: enter image description here]

Waiting for your kind responses.

Thanks

I am not very sure, but here is a PHP library I found: Here

Try this:

http://www.pdfaid.com/pdf-to-html.aspx

Or this: http://webdesign.about.com/od/pdf/tp/tools-for-converting-pdf-to-html.htm

Or this... http://www.pdfconvertonline.com/pdf-to-html-online.html

There are plenty of options available to you; the secret is to use a newfangled thing called a search engine, such as Bing or Google.

You would also do well to research on Stack Overflow before asking your question:

1) HTML 3.2 was superseded in 1997, which is very nearly twenty years ago. Why on earth do you still need a comparatively ancient technology when there are far better alternatives available, such as XHTML, HTML 4.01 and HTML5?

2) Please read "How can I extract embedded fonts from a PDF as valid font files?"

3) Also to extract images you can use: http://www.makeuseof.com/tag/extract-images-pdf-files-save-windows/ but again, there are several options available to you if you care to look for them.
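As a rough illustration of point 3, here is a minimal sketch that is not taken from any of the linked pages. It assumes the Poppler command-line tool `pdfimages` is installed, and the function names, the `img` prefix and the folder name are made up for the example. It is written in Python for brevity; the same command could be run from PHP. Note that the `-j` flag only writes images that are already JPEG-encoded inside the PDF as .jpg, so other images would still need converting.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_dir, prefix="img"):
    """Build the argument list for Poppler's pdfimages (hypothetical helper).

    -j writes JPEG (DCT) image streams as .jpg files; other image types
    are still written in pdfimages' default formats.
    """
    return ["pdfimages", "-j", str(pdf_path), str(Path(out_dir) / prefix)]


def extract_images(pdf_path, out_dir):
    """Run pdfimages and return the .jpg files found in the result folder."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages (poppler-utils) is not installed")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # images and HTML share this folder
    subprocess.run(pdfimages_command(pdf_path, out), check=True)
    return sorted(out.glob("*.jpg"))
```

Calling `extract_images("input.pdf", "result")` would then leave the extracted images in `result/`, where the generated HTML file can sit next to them.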

Your question seems to imply a fundamental misunderstanding about HTML: there are several different ways of getting any desired result with HTML. You have a PDF file and you want it to look a certain way, but that look depends on the browser you view it in. For example, if you use one of the PDF to HTML converters linked above, you will very probably find that the output looks different in Internet Explorer 7 versus Firefox versus Internet Explorer 10. There is no single way of writing output in HTML or with CSS.

If you want a custom-built library to do your specific task, then you will need to employ a professional to do it, or you will need to code it yourself. This obviously should be charged to the client for requiring a technology that is extremely outdated. You can probably search GitHub for a similar library (the one linked by CK Khan looks like what you're after), then fork it and make your own variation for your needs. I very much doubt anyone is going to put time into developing a system to output HTML 3.2 from a PDF, and it is even less likely that anyone would develop such a system for free and to your exact specifications.

It also appears that you cannot specify font families directly in the <font> tag in HTML 3.2; it only lets you set the size and colour of fonts. You can use the CSS1 font-family property to set font families instead. See here.
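To make the <font> limitation above concrete, here is a small sketch that builds an HTML 3.2 page where size and colour go on the <font> tag and the font family goes in a CSS1 rule in the head. It is in Python purely for illustration, and the helper names `font_run`, `css_family` and `page` are invented for this example. HTML 3.2 only reserves the <style> element as a placeholder, so whether the rule is honoured depends on the browser.

```python
from html import escape

DOCTYPE = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">'


def font_run(text, size=3, color="#000000"):
    """Wrap text in an HTML 3.2 <font> tag (size 1-7 and colour only)."""
    if not 1 <= size <= 7:
        raise ValueError("HTML 3.2 font sizes run from 1 to 7")
    return '<font size="%d" color="%s">%s</font>' % (size, color, escape(text))


def css_family(names):
    """Join family names for CSS, quoting only names that contain spaces."""
    return ", ".join('"%s"' % n if " " in n else n for n in names)


def page(title, body_html, families):
    """Build a minimal page; the font family lives in a CSS1 rule."""
    return "\n".join([
        DOCTYPE,
        "<html>",
        "<head>",
        "<title>%s</title>" % escape(title),
        "<style>",
        "<!--",  # hides the rule from browsers that do not know <style>
        "body { font-family: %s }" % css_family(families),
        "-->",
        "</style>",
        "</head>",
        "<body>",
        body_html,
        "</body>",
        "</html>",
    ])
```

For example, `page("Demo", font_run("Hello", 4, "#333333"), ["Times New Roman", "serif"])` returns a page whose body text is requested in Times New Roman, with the size and colour set through <font>.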

