Extraction Quick Start

This is the quickest way to get started with the client libraries.

Installation Instructions

Requires Python ≥ 3.9. Python ≥ 3.11 is recommended.

Install the PyPI package using pip:

pip install -U mindee~=4.36

Don't see support for your favorite language or framework? Make a feature request!

Send a File and Poll

Make a note of your model's ID for use in the API.

When getting started, we recommend the polling method, which is the quickest to set up (unless you already have access to a public-facing web server).

Here is a basic code example. It is self-contained and can be run as-is:

Requires the Mindee Python client library version 4.35.1 or greater.

from mindee import (
    ClientV2,
    InferenceParameters,
    InferenceResponse,
    PathInput,
)

input_path = "/path/to/the/file.ext"
api_key = "MY_API_KEY"
model_id = "MY_MODEL_ID"

# Init a new client
mindee_client = ClientV2(api_key)

# Set inference parameters
model_params = InferenceParameters(
    # ID of the model, required.
    model_id=model_id,

    # Options: set to `True` or `False` to override defaults

    # Enhance extraction accuracy with Retrieval-Augmented Generation.
    rag=None,
    # Extract the full text content from the document as strings.
    raw_text=None,
    # Calculate bounding box polygons for all fields.
    polygon=None,
    # Boost the precision and accuracy of all extractions.
    # Calculate confidence scores for all fields.
    confidence=None,
)

# Load a file from disk
input_source = PathInput(input_path)

# Send for processing
response = mindee_client.enqueue_and_get_result(
    InferenceResponse,
    input_source,
    model_params,
)

# Print a brief summary of the parsed data
print(response.inference)

# Access the result fields
fields: dict = response.inference.result.fields

Also take a look at the Extraction Result documentation.

Details on Sending

For details on available options and advanced usage, check the following sections:

Process Extraction Results

Once you've sent the file and retrieved the response, you can start accessing the results.

The Extraction model's fields are in the fields object of the response (the response variable returned by the step above).

Each key in the fields object corresponds to the field's name in your Data Schema.

You'll want to adapt your processing depending on the type of field, for example when looping over lists or accessing sub-fields.

Accessing simple values, using the name of the field in the Data Schema.

You can (and should!) specify the type of the value. The possible types are str, bool, and float. Note that a value of any type may be None.
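As a minimal sketch of typed simple-value access with None handling, the example below uses a hypothetical stand-in class, `SimpleField`, rather than the client library's own field classes. It assumes only that a simple field exposes its content through a `value` attribute; the field names `supplier_name`, `total_amount`, and `is_paid` are illustrative and not taken from any real Data Schema.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class SimpleField:
    """Hypothetical stand-in for a simple field (not the library's class)."""
    value: Optional[Union[str, bool, float]]


# Stand-in for `response.inference.result.fields` (illustrative data)
fields = {
    "supplier_name": SimpleField("Acme Corp"),
    "total_amount": SimpleField(42.5),
    "is_paid": SimpleField(None),  # a value of any type may be None
}

# Annotate each value with the type expected from the Data Schema
supplier_name: Optional[str] = fields["supplier_name"].value
total_amount: Optional[float] = fields["total_amount"].value
is_paid: Optional[bool] = fields["is_paid"].value

# Handle None before formatting or comparing
amount_label = "unknown" if total_amount is None else f"{total_amount:.2f}"
paid_label = "unknown" if is_paid is None else str(is_paid)
```

Checking for None explicitly, rather than relying on truthiness, keeps legitimate values such as `False` or `0.0` from being treated as missing.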

Accessing a list of simple values, where my_list_field is the name of the field in the Model.
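The looping pattern for a list of simple values might look like the sketch below. `SimpleField` and `ListField` are hypothetical stand-ins, and the `items` and `value` attribute names are assumptions made for illustration; check the Extraction Result documentation for the library's actual accessors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SimpleField:
    """Hypothetical stand-in for a simple field (not the library's class)."""
    value: Optional[str]


@dataclass
class ListField:
    """Hypothetical stand-in for a list field holding simple items."""
    items: List[SimpleField] = field(default_factory=list)


# Stand-in for `response.inference.result.fields` (illustrative data)
fields = {
    "my_list_field": ListField(
        [SimpleField("first"), SimpleField(None), SimpleField("second")]
    ),
}

# Collect the values, skipping items whose value is None
values = [
    item.value
    for item in fields["my_list_field"].items
    if item.value is not None
]
```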

Accessing an object field and its sub-fields, where my_object_field is the name of the field in the Model. In this hypothetical case, the object has a sub-field named subfield_1.
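A minimal sketch of sub-field access follows. `ObjectField` and `SimpleField` are hypothetical stand-ins, and the assumption that an object field exposes its sub-fields as a dictionary named `fields` is made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SimpleField:
    """Hypothetical stand-in for a simple field (not the library's class)."""
    value: Optional[str]


@dataclass
class ObjectField:
    """Hypothetical stand-in for an object field holding named sub-fields."""
    fields: Dict[str, SimpleField] = field(default_factory=dict)


# Stand-in for `response.inference.result.fields` (illustrative data)
fields = {
    "my_object_field": ObjectField({"subfield_1": SimpleField("hello")}),
}

my_object_field = fields["my_object_field"]
sub_fields = my_object_field.fields

# Use .get() so a sub-field absent from the response does not raise KeyError
subfield_1 = sub_fields.get("subfield_1")
subfield_1_value = subfield_1.value if subfield_1 is not None else None

# A sub-field that is not present simply yields None here
missing = sub_fields.get("subfield_2")
```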

Accessing a list of objects, where my_object_list_field is the name of the field in the Model.
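Combining the two previous patterns, a list of objects can be flattened into plain dictionaries, one per object. As before, the classes below are hypothetical stand-ins, and the `items`, `fields`, and `value` attribute names are assumptions rather than the library's confirmed API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SimpleField:
    """Hypothetical stand-in for a simple field (not the library's class)."""
    value: Optional[str]


@dataclass
class ObjectField:
    """Hypothetical stand-in for an object holding named sub-fields."""
    fields: Dict[str, SimpleField] = field(default_factory=dict)


@dataclass
class ObjectListField:
    """Hypothetical stand-in for a list field holding objects."""
    items: List[ObjectField] = field(default_factory=list)


# Stand-in for `response.inference.result.fields` (illustrative data)
fields = {
    "my_object_list_field": ObjectListField(
        [
            ObjectField({"subfield_1": SimpleField("first")}),
            ObjectField({"subfield_1": SimpleField(None)}),
        ]
    ),
}

# Build one plain dict per object, mapping sub-field name to its value
rows = []
for object_item in fields["my_object_list_field"].items:
    rows.append({name: sub.value for name, sub in object_item.fields.items()})
```

Plain dictionaries like `rows` are convenient to serialize or load into a table, at the cost of dropping any per-field metadata the real field objects may carry.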

Details on Response Processing

For more details on using the result fields in your application: Extraction Result

For details on response metadata: Response Processing


