Reading .docx and .pdf files using PL/SQL

I want to read .docx and .pdf files stored on local disk using PL/SQL, and extract some data such as name, contact and email address from them.

All this using PL/SQL.

Any help will be appreciated.

Oracle has a product which handles free text: Oracle Text. It can deal with common binary formats, so you should be all right with Word and PDF. The Oracle Text documentation has the details.

Text supports searching of documents, with different index types for various use cases. However, like normal indexes, they are really suited to equality searches. That is, we can search a document for a specific email like this:

select * from t23
where contains(col_t, 'muhammad.hannan@example.com') > 0
/

But it's not very helpful when it comes to extracting all email addresses from a document. That's why Nature gave us tools for defining structured documents (XML, JSON). So how well Text will support your actual use case depends on the details, which you haven't posted.
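
If extraction is what you need, one possible workaround is to ask Oracle Text for the filtered plain text of a document and then run a regular expression over it. The following is only a sketch under stated assumptions, not a tested solution: it assumes that table t23 has a single-column primary key, that the row of interest has key 1, and that a CONTEXT index with the hypothetical name t23_ctx exists on t23.col_t. The email pattern is deliberately simple and will not match every valid address.

```sql
set serveroutput on

declare
  l_text   clob;
  l_email  varchar2(320);
  l_n      pls_integer := 1;
  -- Simplified pattern for illustration; real addresses can be more exotic.
  c_pattern constant varchar2(100) :=
    '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}';
begin
  -- Identify documents by primary key value rather than by rowid.
  ctx_doc.set_key_type('PRIMARY_KEY');

  -- Filter the binary document to plain text (last argument: plaintext => TRUE).
  -- t23_ctx is an assumed index name; '1' is an assumed primary key value.
  ctx_doc.filter('t23_ctx', '1', l_text, true);

  -- Walk through every match of the pattern in the filtered text.
  loop
    l_email := regexp_substr(l_text, c_pattern, 1, l_n);
    exit when l_email is null;
    dbms_output.put_line(l_email);
    l_n := l_n + 1;
  end loop;

  -- CTX_DOC.FILTER allocates a temporary LOB; release it when done.
  dbms_lob.freetemporary(l_text);
end;
/
```

This works tolerably for values with a recognisable shape, such as email addresses. Names and contact details have no such pattern, which is where the lack of document structure really bites.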


Your question says 'local files'. Oracle Text will work with BFILEs, that is, externally stored files. Define the table column with the BFILE datatype.

However, BFILEs must be held in OS directories on the database server (i.e. local to the database, not your PC), which are subject to the expected security permissions. Those OS directories are exposed to the database as directory objects, which are created with the CREATE DIRECTORY statement.
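
To make the moving parts concrete, here is a minimal setup sketch. Everything named in it is an assumption for illustration: the directory object doc_dir, the path /u01/app/docs, the user app_user, the file resume.docx and the index name t23_ctx are all hypothetical, and the filter preference shown is the one shipped with recent releases.

```sql
-- Run as a user allowed to create directory objects.
-- The path must exist on the database server, readable by the Oracle OS user.
create or replace directory doc_dir as '/u01/app/docs';
grant read on directory doc_dir to app_user;

-- Table with a BFILE column, matching the t23 / col_t names used in the query above.
create table t23 (
  id    number primary key,
  col_t bfile
);

-- BFILENAME stores a pointer to the file, not the file's contents.
-- The directory name is in upper case because that is how Oracle stored it.
insert into t23 (id, col_t)
values (1, bfilename('DOC_DIR', 'resume.docx'));
commit;

-- CONTEXT index; the filter converts Word and PDF content to text while indexing.
create index t23_ctx on t23 (col_t)
  indextype is ctxsys.context
  parameters ('filter ctxsys.auto_filter');
```

A CONTEXT index is not maintained as part of the transaction by default, so rows added after the index is built only become searchable once the index has been synchronised, for example with ctx_ddl.sync_index('t23_ctx').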
