When extracting Arabic text from a PDF file using libraries like PyMuPDF or PDFMiner, the words are returned in backward order, which is a normal beha ...
I am trying to extract tables from PDF files. After trying several different packages, tabula is the best one for extracting the tables from my pd ...
I have some PDFs with data about machine parts, and I am trying to extract sizes. I extracted the text from a PDF via pypdfium2. Most of the text is ...
I am trying to extract the hyperlinks present on each page, along with their anchor text, from a PDF using the PyMuPDF library. I am able to extract hyperlinks with thei ...
Part 3 of a previous post. The task: I am attempting to iterate over a series of URLs listed in Excel and generate complete text files for each. ...
In .NET, using the Adobe Extract API for PDF-to-text conversion, I'm getting structured JSON information (zipped). How can I get a plain text file using this infor ...
I have PDF files in the same folder. How can I get all the PDF file names and save them as an Excel file according to PDF file name? This is what I have tried ...
I am trying to extract a table like this into a DataFrame. How can I do that (and even extract the names split across several lines) with Python? Also, I ...
In my project I need to read a PDF document. This PDF contains Ukrainian and Russian characters. The PDFReader reads all characters in this PDF but t ...
Here I share my code main.py Result :Abdul Moeez :E-mail- amoeez14@gmail.com : Phone +1111111111 : Address Karachi, Sindh, Pakistan Ho ...
I am using Document Understanding in UiPath to extract data from multiple PDFs. Each PDF file contains multiple copies of the same page which I canno ...
I am trying to extract comments from a PDF using Python. These are the two pieces of code that I have tested: One using PyPDF2: and the other usin ...
I'm using Camelot to extract table information from a PDF that I have converted from scanned to searchable using ocrmypdf (500 dpi). Camelot seems to be ...
I am trying to extract Hindi text from a PDF. I tried all the methods to extract from the PDF, but none of them worked. There are explanations why it d ...
I am trying to get the fill color of paths with itext7 using fillclr= pathrenderinfo.getfillcolor.getcolorvalue() but it gives the value in format of dev ...
To more accurately extract table-like data embedded within table cells, I would like to be able to identify table cell boundaries in PDFs like this: ...
I'm trying to extract the text from a PDF URL. If I download the PDF I can easily extract the text with the function slate. However, when trying to ...
I've got this PDF file, an image-based, low-resolution PDF. I'm trying to extract the data in it and all options I've tried seem not to work. Option ...
I have an image (attached) and want to extract certain fields from the form. For example the name 'Sarah', her email address etc. I have the region of ...