Base64-encoded text input.
base64_string (str)
filename (str)
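As an illustration of what a Base64 input has to do before a file can be sent, the following minimal sketch decodes a Base64 string into an in-memory binary stream using only the standard library. The helper name `base64_to_stream` is hypothetical and is not part of the library.

```python
import base64
import io


def base64_to_stream(base64_string: str) -> io.BytesIO:
    """Decode a Base64 string into an in-memory binary stream (hypothetical helper)."""
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently discarding them.
    raw = base64.b64decode(base64_string, validate=True)
    return io.BytesIO(raw)


# Usage: encode some bytes, then decode them back into a stream.
encoded = base64.b64encode(b"%PDF-1.7 sample").decode("ascii")
stream = base64_to_stream(encoded)
```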
Base class for parameters accepted by all V2 endpoints.
model_id (str)
alias (str | None)
webhook_ids (List[str] | None)
polling_options (PollingOptions | None)
close_file (bool)
Getter for the enqueue slug.
str
Return the parameters as a config dictionary.
Dict[str, Union[str, List[str]]]
A dict of parameters.
alias (Optional[str], default: None) – Use an alias to link the file to your own DB. If empty, no alias will be used.
close_file (bool, default: True) – Whether to close the file after processing.
model_id (str) – ID of the model, required.
polling_options (Optional[PollingOptions], default: None) – Options for polling. Set only if having timeout issues.
webhook_ids (Optional[List[str]], default: None) – IDs of webhooks to propagate the API response to.
Raw bytes input.
raw_bytes (bytes)
filename (str)
A binary file input.
file (BinaryIO)
Inference parameters to set when sending a file.
model_id (str)
alias (str | None)
webhook_ids (List[str] | None)
polling_options (PollingOptions | None)
close_file (bool)
_slug (str)
rag (bool | None)
raw_text (bool | None)
polygon (bool | None)
confidence (bool | None)
text_context (str | None)
data_schema (DataSchema | str | dict | None)
Return the parameters as a config dictionary.
Dict[str, Union[str, List[str]]]
A dict of parameters.
confidence (Optional[bool], default: None) – Boost the precision and accuracy of all extractions. Calculate confidence scores for all fields, and fill their confidence attribute.
data_schema (Union[DataSchema, str, dict, None], default: None) – Dynamic changes to the data schema of the model for this inference. Not recommended, for specific use only.
polygon (Optional[bool], default: None) – Calculate bounding box polygons for all fields, and fill their locations attribute.
rag (Optional[bool], default: None) – Enhance extraction accuracy with Retrieval-Augmented Generation.
raw_text (Optional[bool], default: None) – Extract the full text content from the document as strings, and fill the raw_text attribute.
text_context (Optional[str], default: None) – Additional text context used by the model during inference. Not recommended, for specific use only.
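To make the "parameters as a config dictionary" idea concrete, here is a minimal sketch of how optional parameters might be flattened into a `Dict[str, Union[str, List[str]]]`. The class `SketchParameters`, the method name `get_config`, the decision to omit unset values, and the lowercase string form of booleans are all assumptions made for illustration; the library's actual serialization may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class SketchParameters:
    """Hypothetical stand-in for a parameters class, not the library's own."""

    model_id: str
    alias: Optional[str] = None
    webhook_ids: Optional[List[str]] = None
    rag: Optional[bool] = None
    raw_text: Optional[bool] = None

    def get_config(self) -> Dict[str, Union[str, List[str]]]:
        """Return only the parameters that were explicitly set."""
        config: Dict[str, Union[str, List[str]]] = {"model_id": self.model_id}
        if self.alias:
            config["alias"] = self.alias
        if self.webhook_ids:
            config["webhook_ids"] = list(self.webhook_ids)
        # Assumption: booleans are sent as lowercase strings, None is omitted.
        for name in ("rag", "raw_text"):
            value = getattr(self, name)
            if value is not None:
                config[name] = str(value).lower()
        return config


config = SketchParameters(model_id="my-model-id", rag=True).get_config()
```

Leaving unset options out of the dictionary keeps the request minimal and lets server-side defaults apply, which is one plausible reason for the `Optional[...] = None` defaults above.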
The input type, for internal use.
Base class for all input sources coming from the local machine.
input_type (InputType)
Apply cut and merge options on multipage documents.
None
page_options (PageOptions)
Close the file object.
None
Compresses the file object, either as a PDF or an image.
quality (int, default: 85) – Quality of the compression. For images, this is the JPEG quality.
For PDFs, this affects image quality within the PDF.
max_width (Optional[int], default: None) – Maximum width for image resizing. Ignored for PDFs.
max_height (Optional[int], default: None) – Maximum height for image resizing. Ignored for PDFs.
force_source_text (bool, default: False) – For PDFs, whether to force compression even if source text is present.
disable_source_text (bool, default: True) – For PDFs, whether to disable source text during compression.
None
Deprecated. Use page_count instead.
int
Fix a potentially broken pdf file.
WARNING: this feature alters the data of the enqueued file by removing unnecessary headers.
Reads the bytes of a PDF file until a proper pdf tag is encountered, or until the maximum offset has been reached. If a tag denoting a PDF file is found, deletes all bytes before it.
maximum_offset (int, default: 500) – Maximum byte offset up to which superfluous headers will be removed.
Cannot be less than 0.
None
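The header-stripping behavior described above can be sketched with plain byte operations. This is an illustration under stated assumptions, not the library's implementation: the function name `strip_leading_bytes` is hypothetical, the tag searched for is assumed to be `%PDF-`, and raising `ValueError` when no tag is found is an assumption.

```python
def strip_leading_bytes(data: bytes, maximum_offset: int = 500) -> bytes:
    """Drop any bytes that precede the PDF tag (hypothetical helper)."""
    if maximum_offset < 0:
        raise ValueError("maximum_offset cannot be less than 0")
    marker = b"%PDF-"
    # Limit the search so the marker must start at or before maximum_offset.
    position = data.find(marker, 0, maximum_offset + len(marker))
    if position == -1:
        # Assumption: a missing tag is treated as an error.
        raise ValueError("no PDF tag found within the maximum offset")
    return data[position:]


fixed = strip_leading_bytes(b"junk\n%PDF-1.4\nbody")
```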
If the file is a PDF, checks if it has source text.
bool
True if the file is a PDF and has source text. False otherwise.
bool
True if the file is a PDF.
Check if the PDF is empty.
bool
True if the PDF is empty.
Create a new PDF from pages and set it to file_object.
page_numbers (set) – Set of page numbers to use for merging in the original PDF.
None
None
Run any required processing on a PDF file.
None
behavior (str)
on_min_pages (int)
page_indexes (Sequence)
Read the contents of the input file.
close_file (bool) – Whether to close the file after reading.
Tuple[str, bytes]
A tuple with the file name and binary data.
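A read operation with this shape can be sketched as follows. The function `read_contents` is a hypothetical free function, and rewinding the stream before reading is an assumption rather than documented behavior.

```python
import io
from typing import BinaryIO, Tuple


def read_contents(
    file_object: BinaryIO, filename: str, close_file: bool = True
) -> Tuple[str, bytes]:
    """Return the file name and the full binary content (hypothetical helper)."""
    # Assumption: rewind first so a partially read stream is read in full.
    file_object.seek(0)
    data = file_object.read()
    if close_file:
        file_object.close()
    return filename, data


name, content = read_contents(io.BytesIO(b"abc"), "sample.txt")
```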
str
BinaryIO
str
Optional[str]
Count the pages in the document.
The number of pages.
Local response loaded from a file.
input_file (BinaryIO | str | Path | bytes)
Load a local inference.
Typically used to load a V2 webhook callback.
TypeVar(ResponseT, bound= CommonResponse)
response_class (Type[ResponseT])
Returns the hmac signature of the local response, from the secret key provided.
secret_key (Union[str, bytes, bytearray]) – Secret key, either a string or a bytes/bytearray object.
The hmac signature of the local response.
Checks if the hmac signature of the local response is valid.
secret_key (Union[str, bytes, bytearray]) – Secret key, either a string or a bytes/bytearray object.
signature (str) – HMAC signature, given as a string.
True if the HMAC signature is valid.
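Signature computation and verification of this kind can be sketched with the standard `hmac` module. The digest algorithm is not stated here, so SHA-256 with a hexadecimal digest is an assumption, and the function names `compute_signature` and `is_valid_signature` are hypothetical.

```python
import hashlib
import hmac
from typing import Union


def compute_signature(payload: bytes, secret_key: Union[str, bytes, bytearray]) -> str:
    """Return the hex HMAC of the payload (SHA-256 is an assumption)."""
    key = secret_key.encode("utf-8") if isinstance(secret_key, str) else bytes(secret_key)
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_valid_signature(
    payload: bytes, secret_key: Union[str, bytes, bytearray], signature: str
) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = compute_signature(payload, secret_key)
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


body = b'{"inference": {}}'
signature = compute_signature(body, "my-secret")
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information about how many leading characters of the signature matched.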
Returns the dictionary representation of the file.
A json-like dictionary.
Options to pass to the parse method for cutting multipage documents.
page_indexes (Sequence[int])
operation (str)
on_min_pages (int)
on_min_pages (int) – Apply the operation only if the document has at least this many pages.
Default: 0 (apply to all documents).
operation (str) – Operation to apply on the document, given the page_indexes specified:
KEEP_ONLY - keep only the specified pages, and remove all others.
REMOVE - remove the specified pages, and keep all others.
page_indexes (Sequence[int]) – Zero-based list of page indexes. A negative index can be used, indicating an offset from the end of the document.
[0, -1] represents the first and last pages of the document.
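The interaction between page indexes, the operation, and the minimum page count can be sketched as a pure function over page positions. This is an illustration only: `select_pages` is a hypothetical helper, and silently skipping out-of-range or duplicate indexes is an assumption.

```python
from typing import List, Sequence

KEEP_ONLY = "KEEP_ONLY"
REMOVE = "REMOVE"


def select_pages(
    page_count: int,
    page_indexes: Sequence[int],
    operation: str = KEEP_ONLY,
    on_min_pages: int = 0,
) -> List[int]:
    """Return the zero-based pages that remain after the operation."""
    all_pages = list(range(page_count))
    # Leave short documents untouched.
    if page_count < on_min_pages:
        return all_pages
    resolved: List[int] = []
    for index in page_indexes:
        # A negative index counts back from the end of the document.
        position = index + page_count if index < 0 else index
        # Assumption: out-of-range and duplicate indexes are skipped.
        if 0 <= position < page_count and position not in resolved:
            resolved.append(position)
    if operation == KEEP_ONLY:
        return resolved
    if operation == REMOVE:
        return [page for page in all_pages if page not in resolved]
    raise ValueError(f"unknown operation: {operation}")


kept = select_pages(5, [0, -1])
```

For a five-page document, `[0, -1]` with KEEP_ONLY keeps pages 0 and 4, while the same indexes with REMOVE leave pages 1, 2 and 3.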
A local path input.
filepath (str | None)
Options for asynchronous polling.
initial_delay_sec (float)
delay_sec (float)
max_retries (int)
delay_sec (float) – Delay between each polling attempt.
initial_delay_sec (float) – Initial delay before the first polling attempt.
max_retries (int) – Total number of polling attempts.
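How these three options combine can be sketched as a simple polling loop. The function `poll`, its `fetch` callback, the injectable `sleep` argument, and the choice of `TimeoutError` are all hypothetical; no default values are shown because the library's defaults are not stated here.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll(
    fetch: Callable[[], Optional[T]],
    initial_delay_sec: float,
    delay_sec: float,
    max_retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch until it returns a result or the attempts run out."""
    # Wait once before the first attempt.
    sleep(initial_delay_sec)
    for attempt in range(max_retries):
        result = fetch()
        if result is not None:
            return result
        # No need to wait after the final attempt.
        if attempt < max_retries - 1:
            sleep(delay_sec)
    raise TimeoutError(f"no result after {max_retries} attempts")


# Usage with a fake job that completes on the third attempt.
recorded_delays = []
answers = iter([None, None, "done"])
outcome = poll(lambda: next(answers), 2.0, 1.0, 5, sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.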
A local or remote URL input.
url (str)
Convert the URL content to a BytesInput object.
filename (Optional[str], default: None) – Optional filename for the BytesInput.
username (Optional[str], default: None) – Optional username for authentication.
password (Optional[str], default: None) – Optional password for authentication.
token (Optional[str], default: None) – Optional token for authentication.
headers (Optional[dict], default: None) – Optional additional headers for the request.
max_redirects (int, default: 3) – Maximum number of redirects to follow.
A BytesInput object containing the file content.
Save the content of the URL to a file.
filepath (Union[Path, str]) – Path to save the content to.
filename (Optional[str], default: None) – Optional filename to give to the file.
username (Optional[str], default: None) – Optional username for authentication.
password (Optional[str], default: None) – Optional password for authentication.
token (Optional[str], default: None) – Optional token for authentication.
headers (Optional[dict], default: None) – Optional additional headers for the request.
max_redirects (int, default: 3) – Maximum number of redirects to follow.
Path
The path to the saved file.
url (str) – The Uniform Resource Locator.
Options to pass to a workflow execution.
alias (str | None)
priority (ExecutionPriority | None)
full_text (bool)
public_url (str | None)
rag (bool)
alias (Optional[str]) – Alias for the document.
full_text (bool) – Whether to include the full OCR text response in compatible APIs.
priority (Optional[ExecutionPriority]) – Priority of the document.
public_url (Optional[str]) – A unique, encrypted URL for accessing the document validation interface without requiring authentication.
rag (bool) – Whether to enable Retrieval-Augmented Generation.