Layout Analysis (Asynchronous)

How it works

Using the asynchronous Inference Request API in Layout Analysis, users can submit an inference request for an image or a PDF document of up to 1000 pages. On receiving the request, the API immediately returns a request ID. After the input file is transferred and validated, it is divided into batches of 10 pages each, and inference is performed on each batch. Users can check the status of each batch in real time using the Inference Response API, and download the results of each batch from the download_url provided in the response. Results remain available for download for 30 days from the time of the request.
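The batching described above can be sketched as follows. This is only an illustration of how a document's pages map onto 10-page batches; the function name batch_page_ranges is hypothetical and not part of the API.

```python
def batch_page_ranges(total_pages: int, batch_size: int = 10) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start_page, end_page) pairs, one per batch."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    return [
        # The last batch may be shorter than batch_size.
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]


print(batch_page_ranges(28))  # [(1, 10), (11, 20), (21, 28)]
```

A 28-page document therefore yields three batches, and a 1000-page document yields 100.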

Available models

The asynchronous API uses the same model as the synchronous API and is available from layout-analysis-0.4.0 onward.

Inference Request

POST https://api.upstage.ai/v1/document-ai/async/layout-analysis

Parameters

Request headers

Authorization string Required
Authentication token, format: Bearer API_KEY

Request body

document file Required
The document file to be processed. Supported file formats are listed under Requirements.

ocr boolean Optional
A boolean value indicating whether to perform OCR inference on the document before layout detection. The default is false. When this option is set to false for PDF documents, the API directly extracts text and coordinates from the document without converting it to images. Otherwise, the API converts the input file to images and performs OCR inference before layout detection.

Requirements

Supported file formats: JPEG, PNG, BMP, PDF, TIFF, HEIC
Maximum file size: 50MB
Maximum number of pages per file: 1000 pages (for files exceeding 1000 pages, the API returns an error)
Maximum pixels per page: 100,000,000 pixels. For non-image files, the pixel count is determined after converting to images at a standard of 150 DPI.
Supported character sets: Alphanumeric, Hangul, and Hanja are supported. Hanzi and Kanji are in beta, which means they are available but not fully supported.
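A client can check these limits before uploading. The sketch below is a hedged illustration, not the API's own validation: the extension list (including .jpg and .tif), the reading of 50MB as 50,000,000 bytes, the rounding of rendered dimensions, and the names rendered_pixels and check_limits are all assumptions. It relies on PDF page sizes being expressed in points (72 per inch) to estimate the pixel count at 150 DPI.

```python
from pathlib import Path

# Assumed mapping of the listed formats to common file extensions.
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".pdf", ".tiff", ".tif", ".heic"}
MAX_FILE_BYTES = 50_000_000        # assumes "50MB" means 50 * 10^6 bytes
MAX_PAGES = 1000
MAX_PIXELS_PER_PAGE = 100_000_000
RENDER_DPI = 150
POINTS_PER_INCH = 72               # PDF page dimensions are given in points


def rendered_pixels(width_pt: float, height_pt: float, dpi: int = RENDER_DPI) -> int:
    """Estimate the pixel count of a PDF page rendered at the given DPI."""
    width_px = round(width_pt / POINTS_PER_INCH * dpi)
    height_px = round(height_pt / POINTS_PER_INCH * dpi)
    return width_px * height_px


def check_limits(filename: str, size_bytes: int, page_count: int,
                 max_page_pixels: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means no limit is exceeded."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 50MB")
    if page_count > MAX_PAGES:
        problems.append("file has more than 1000 pages")
    if max_page_pixels > MAX_PIXELS_PER_PAGE:
        problems.append("a page exceeds 100,000,000 pixels")
    return problems


# A US Letter page (612 x 792 points) renders to 1275 x 1650 pixels at 150 DPI.
print(rendered_pixels(612, 792))  # 2103750
```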

Hanja, Hanzi, and Kanji are writing systems based on Chinese characters, used in Korean, Chinese, and Japanese respectively. Despite their similarities, they differ in visual form, pronunciation, meaning, and usage conventions within each language. For more information, see this article.

For best results, follow these guidelines:

Use high-resolution documents to ensure legibility of text.

Ensure a minimum document width of 640 pixels.

The performance of the model might vary depending on the text size. Ensure that the smallest text in the image is at least 2.5% of the image's height. For example, if the image is 640 pixels tall, the smallest text should be at least 16 pixels tall.
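The width and text-size guidelines above reduce to simple arithmetic. The following is a minimal sketch; the function names are hypothetical, and rounding the threshold up to a whole pixel is an assumption.

```python
MIN_WIDTH_PX = 640


def min_text_height_px(image_height_px: int) -> int:
    """Smallest recommended text height: 2.5% of the image height, rounded up."""
    # Integer arithmetic (25 / 1000 = 2.5%) avoids floating-point rounding surprises.
    return (image_height_px * 25 + 999) // 1000


def meets_guidelines(width_px: int, height_px: int, smallest_text_px: int) -> bool:
    """Check the minimum width and the minimum text height guidelines."""
    return width_px >= MIN_WIDTH_PX and smallest_text_px >= min_text_height_px(height_px)


print(min_text_height_px(640))  # 16
```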

Returns

Request ID for the request.

{
    "request_id": "e7b1b3b0-1b3b-4b3b-8b3b-1b3b3b3b3b3b"
}
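For readers calling the endpoint from Python rather than curl, the request can be sketched with the standard library alone. This is an illustration under stated assumptions: build_multipart and submit_document are hypothetical helper names, the multipart body is assembled by hand, and error handling is omitted. Only the URL, the Authorization header format, and the document and ocr fields come from this page.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

ASYNC_URL = "https://api.upstage.ai/v1/document-ai/async/layout-analysis"


def build_multipart(filename: str, file_bytes: bytes, ocr: bool = False) -> tuple[bytes, str]:
    """Encode the `document` and `ocr` form fields as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    document_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="document"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode()
    ocr_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="ocr"\r\n\r\n'
        f"{'true' if ocr else 'false'}\r\n"
    ).encode()
    closing = f"--{boundary}--\r\n".encode()
    body = document_header + file_bytes + b"\r\n" + ocr_part + closing
    return body, f"multipart/form-data; boundary={boundary}"


def submit_document(path: str, api_key: str, ocr: bool = False) -> str:
    """Upload a file and return the request_id from the JSON response."""
    file_path = Path(path)
    body, content_type = build_multipart(file_path.name, file_path.read_bytes(), ocr)
    request = urllib.request.Request(
        ASYNC_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["request_id"]
```

A call such as submit_document("invoice.png", "UPSTAGE_API_KEY", ocr=True) would correspond to the curl request shown in the Example section.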

Inference Response

GET https://api.upstage.ai/v1/document-ai/requests/{request_id}

Parameters

Request headers

Authorization string Required
Authentication token, format: Bearer API_KEY

Returns

id string
Unique identifier for the request.

status string
Current status of the request. Possible values are submitted, started, completed, and failed.

model string
A string representing the version of the model being used.

failure_message string
A message indicating the reason for failure, if any.

total_pages integer
Total number of pages in the document.

completed_pages integer
Number of pages processed.

batches list
A list of batches containing the inference results and details for each batch.

batches[].id string
Unique identifier for the batch.

batches[].model string
A string representing the version of the model being used.

batches[].status string
Status of the batch. Possible values are submitted, started, completed, failed, and retrying.

batches[].failure_message string
A message indicating the reason for failure, if any.

batches[].download_url string
A presigned URL to download the inference results of the batch. The schema of the file is described in the Return Values section.

batches[].start_page integer
Page number where the batch starts. Starts from 1 for the first batch.

batches[].end_page integer
Page number where the batch ends.

batches[].requested_at date
The date and time when the batch was requested, formatted according to RFC3339.

batches[].updated_at date
The date and time when the status of the batch was last updated, formatted according to RFC3339.

requested_at date
The date and time when the request was submitted, formatted according to RFC3339.

updated_at date
The date and time when the status of the request was last updated, formatted according to RFC3339.
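Because processing is asynchronous, a client typically polls this endpoint until the request reaches a final status. The sketch below assumes that completed and failed are the only terminal values of status; the polling interval, the timeout, and the names fetch_status and wait_until_done are hypothetical choices, not part of the API.

```python
import json
import time
import urllib.request
from typing import Callable

STATUS_URL = "https://api.upstage.ai/v1/document-ai/requests/{request_id}"
TERMINAL_STATUSES = {"completed", "failed"}  # assumed to be the final states


def fetch_status(request_id: str, api_key: str) -> dict:
    """GET the current state of a request as a parsed JSON object."""
    request = urllib.request.Request(
        STATUS_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_done(fetch: Callable[[], dict], interval_s: float = 5.0,
                    timeout_s: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch` repeatedly until the request is completed or failed."""
    deadline = time.monotonic() + timeout_s
    while True:
        response = fetch()
        if response["status"] in TERMINAL_STATUSES:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request still {response['status']!r} after {timeout_s} s")
        sleep(interval_s)
```

Passing the fetcher as a callable, for example wait_until_done(lambda: fetch_status(request_id, api_key)), keeps the loop independent of the HTTP layer and easy to exercise with canned responses.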

Inference History

GET https://api.upstage.ai/v1/document-ai/requests

Parameters

Request headers

Authorization string Required
Authentication token, format: Bearer API_KEY

Returns

id string
Unique identifier for the request.

status string
Current status of the request. Possible values are scheduled, started, completed, and failed.

model string
A string representing the version of the model being used.

requested_at date
The date and time when the request was submitted, formatted according to ISO 8601 standards.

updated_at date
The date and time when the status of the request was last updated, formatted according to ISO 8601 standards.

Example

Inference Request

Request

curl -X POST https://api.upstage.ai/v1/document-ai/async/layout-analysis \
-H "Authorization: Bearer UPSTAGE_API_KEY" \
-F "document=@invoice.png" \
-F "ocr=true"

Returns

{
    "request_id": "e7b1b3b0-1b3b-4b3b-8b3b-1b3b3b3b3b3b"
}

Inference Response

Request

curl https://api.upstage.ai/v1/document-ai/requests/REQUEST_ID \
-H "Authorization: Bearer UPSTAGE_API_KEY"

Returns

{
    "id": "e7b1b3b0-1b3b-4b3b-8b3b-1b3b3b3b3b3b",
    "status": "completed",
    "model": "layout-analysis",
    "failure_message": "",
    "total_pages": 28,
    "completed_pages": 28,
    "batches": [
        {
            "id": 0,
            "model": "layout-analysis-0.3.1",
            "status": "completed",
            "failure_message": "",
            "download_url": "https://download-url",
            "start_page": 1,
            "end_page": 10,
            "requested_at": "2024-07-01T14:47:01.863880448Z",
            "updated_at": "2024-07-01T14:47:15.901662097Z"
        },
        {
            "id": 1,
            "model": "layout-analysis-0.3.1",
            "status": "completed",
            "failure_message": "",
            "download_url": "https://download-url",
            "start_page": 11,
            "end_page": 20,
            "requested_at": "2024-07-01T14:47:01.863880448Z",
            "updated_at": "2024-07-01T14:47:13.59782266Z"
        },
        {
            "id": 2,
            "model": "layout-analysis-0.3.1",
            "status": "completed",
            "failure_message": "",
            "download_url": "https://download-url",
            "start_page": 21,
            "end_page": 28,
            "requested_at": "2024-07-01T14:47:01.863880448Z",
            "updated_at": "2024-07-01T14:47:37.061016766Z"
        }
    ],
    "requested_at": "2024-07-01T14:47:01.863880448Z",
    "completed_at": "2024-07-01T14:47:43.023542045Z"
}
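A response like the one above can be reduced to overall progress and the list of result files ready for download. This is a minimal sketch using only the fields documented under Inference Response; the function name summarize and the shape of its return value are hypothetical.

```python
def summarize(response: dict) -> dict:
    """Extract progress, ready download URLs, and unfinished page ranges."""
    batches = response.get("batches", [])
    completed = sorted(
        (b for b in batches if b["status"] == "completed"),
        key=lambda b: b["start_page"],
    )
    total = response["total_pages"]
    return {
        # Fraction of pages processed so far, between 0.0 and 1.0.
        "progress": response["completed_pages"] / total if total else 0.0,
        # URLs of finished batches, in page order.
        "download_urls": [b["download_url"] for b in completed],
        # Page ranges still pending, running, retrying, or failed.
        "unfinished_ranges": [
            (b["start_page"], b["end_page"])
            for b in batches
            if b["status"] != "completed"
        ],
    }
```

For the example response, progress is 1.0, download_urls holds three entries, and unfinished_ranges is empty.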

Inference History

Request

curl https://api.upstage.ai/v1/document-ai/requests \
-H "Authorization: Bearer UPSTAGE_API_KEY"

Returns

{
    "requests": [
        {
            "id": "e7b1b3b0-1b3b-4b3b-8b3b-1b3b3b3b3b3b",
            "status": "completed",
            "model": "layout-analysis",
            "requested_at": "2024-07-01T14:47:01.863880448Z",
            "completed_at": "2024-07-01T14:47:43.023542045Z"
        }
    ]
}
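The timestamps in these responses carry nanosecond precision, which Python's datetime cannot represent directly. The sketch below truncates the fraction to microseconds and derives the end of the 30-day download window from requested_at. It assumes timestamps always end in Z (UTC), as in the examples, and the names parse_timestamp and download_deadline are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # results are downloadable for 30 days from the request
TIMESTAMP_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp, truncating fractional seconds to microseconds."""
    match = TIMESTAMP_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported timestamp: {value!r}")
    base, fraction = match.groups()
    # Keep at most six fractional digits and right-pad shorter fractions with zeros.
    microseconds = int((fraction or "0")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=microseconds, tzinfo=timezone.utc)


def download_deadline(requested_at: str) -> datetime:
    """Return the moment after which results are no longer expected to be available."""
    return parse_timestamp(requested_at) + RETENTION


print(download_deadline("2024-07-01T14:47:01.863880448Z").isoformat())
# 2024-07-31T14:47:01.863880+00:00
```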
