I want to convert PDF files into XML. Is there a Java library that can do this?
You can get an XML (XHTML) representation of a PDF document with the Apache Tika library, as shown below:
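import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.ToXMLContentHandler;
import org.xml.sax.ContentHandler;
InputStream stream = new FileInputStream("sample.pdf");
ContentHandler handler = new ToXMLContentHandler(); // collects the XHTML output as text
new AutoDetectParser().parse(stream, handler, new Metadata()); // parse() returns void
System.out.println(handler.toString()); // the XML is read from the handler, not from parse()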
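Once the XML is available as a string, it can be read with the XML parser that ships with the JDK. The following is a minimal sketch, not part of Tika: the class name `XhtmlParagraphs`, the method `paragraphs`, and the sample string in `main` are hypothetical, and the sample is a hand-written stand-in that only assumes the output is namespaced XHTML with `<p>` elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlParagraphs {

    /** Returns the trimmed text of every <p> element found in the given XML string. */
    public static List<String> paragraphs(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // XHTML elements live in a namespace, so the parser must be namespace-aware.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" matches any namespace; "p" is the local element name.
            NodeList nodes = doc.getElementsByTagNameNS("*", "p");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written stand-in for the string returned by handler.toString().
        String sample = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
                + "<div><p>First paragraph</p><p>Second paragraph</p></div>"
                + "</body></html>";
        for (String p : paragraphs(sample)) {
            System.out.println(p);
        }
    }
}
```

In real use you would pass `handler.toString()` to `paragraphs` instead of the sample string. For large PDFs, a streaming approach (handling the SAX events directly rather than building a DOM) may be a better fit.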