Azure Form Recognizer only extracts table when specifying page on which the table is located

Question

I have a four-page PDF file, and page 3 contains a table I want to extract:

with open(f"{data_dir}/{file_name}", "rb") as fd:
    document = fd.read()

    poller = document_analysis_client.begin_analyze_document("prebuilt-layout", document)
    result = poller.result()
    print(result)

When I run this, it does not find any tables in the document.

However, when I run exactly the same code and only add pages="3" or pages="2-" as an argument to begin_analyze_document, it works perfectly!

with open(f"{data_dir}/{file_name}", "rb") as fd:
    document = fd.read()

    poller = document_analysis_client.begin_analyze_document("prebuilt-layout", document, pages="3")
    result = poller.result()
    print(result)

What is going on here?

Answer 1

You have to specify the page range through the pages parameter of the begin_analyze_document() method you are calling, for example pages="3" for a single page. See the method reference: https://learn.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.documentanalysisclient?view=azure-python#azure-ai-formrecognizer-documentanalysisclient-begin-analyze-document
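
To compare the two calls more precisely than with print(result), it may help to list which pages the service reported tables on. Below is a minimal sketch, not an official recipe: the helper name tables_by_page is made up for illustration, and it only assumes the result object exposes tables, each with bounding_regions, row_count and column_count, as the AnalyzeResult returned by poller.result() does. The stand-in object at the bottom is hypothetical data so the snippet runs without an Azure resource.

```python
from collections import defaultdict
from types import SimpleNamespace


def tables_by_page(result):
    """Group detected tables by the page they start on.

    Returns {page_number: [(row_count, column_count), ...]}.
    (Hypothetical helper, not part of the Azure SDK.)
    """
    found = defaultdict(list)
    for table in result.tables or []:
        # bounding_regions lists the page(s) a table spans; use the first one.
        regions = table.bounding_regions or []
        page = regions[0].page_number if regions else None
        found[page].append((table.row_count, table.column_count))
    return dict(found)


# Stand-in for poller.result(), with made-up values, so the sketch runs offline.
fake_result = SimpleNamespace(
    tables=[
        SimpleNamespace(
            bounding_regions=[SimpleNamespace(page_number=3)],
            row_count=5,
            column_count=4,
        )
    ]
)
print(tables_by_page(fake_result))  # {3: [(5, 4)]}
```

With a real client you would pass poller.result() from each call (with and without pages="3") to the helper; an empty dict from the first call and an entry for page 3 from the second would confirm the behaviour described in the question.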
