Layout Analysis

Detect the elements of any document, including tables and figures.

Available models

| Model | Availability | Release date | Description |
|---|---|---|---|
| layout-analysis-0.3.0 | Latest | 2024-06-11 beta | Improve inference speed by up to 30% for PDF documents. |
| layout-analysis-0.2.1 | Deprecated | 2024-05-02 beta | Removed unnecessary <thead> tags from table elements and fixed bugs. |
| layout-analyzer-0.2.0 | Deprecated | 2024-04-04 beta | Improved the accuracy for table recognition and performance for layout detection. |
| layout-analyzer-0.1.0 | Deprecated | 2024-02-28 beta | A layout analyzer model which detects elements within a document, recognizes tables, and serializes elements according to reading order. |

Request

POST https://api.upstage.ai/v1/document-ai/layout-analysis

Parameters

Request headers

Authorization string Required
Authentication token, format: Bearer API_KEY

Request body

document file Required
The document file to be processed. Supported file formats are listed under Requirements.

ocr boolean Optional
A boolean value indicating whether to perform OCR inference on the document before layout detection. The default is true. When this option is set to false for PDF documents, the API directly extracts text and coordinates from the document without converting it to images. Otherwise, the API converts the PDF to images and performs OCR inference before layout detection.

Requirements

  • Supported file formats: JPEG, PNG, BMP, PDF, TIFF, HEIC
  • Maximum file size: 50MB
  • Maximum number of pages per file: 100 pages (For files exceeding 100 pages, the first 100 pages are processed)
  • Maximum pixels per page: 100,000,000 pixels. For non-image files, the pixel count is determined after converting to images at a standard of 150 DPI.
  • Supported character sets: Alphanumeric, Hangul, and Hanja. Hanzi and Kanji are in beta, which means they are available but not fully supported.
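
The limits above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function names are hypothetical, "50MB" is assumed to mean 50 × 1024 × 1024 bytes, the `.jpg` and `.tif` aliases are assumed to be accepted, and PDF page sizes are assumed to be given in points (1/72 inch) so that the 150 DPI conversion can be estimated.

```python
from pathlib import PurePath

# Limits copied from the requirements list above.
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".pdf", ".tiff", ".tif", ".heic"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB
MAX_PAGES = 100
MAX_PIXELS_PER_PAGE = 100_000_000
RENDER_DPI = 150
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points


def rendered_pixels(width_pt: float, height_pt: float) -> int:
    """Estimate the pixel count of a non-image page rasterized at 150 DPI."""
    width_px = round(width_pt / POINTS_PER_INCH * RENDER_DPI)
    height_px = round(height_pt / POINTS_PER_INCH * RENDER_DPI)
    return width_px * height_px


def preflight(filename: str, size_bytes: int, page_count: int, max_page_pixels: int) -> list[str]:
    """Return a list of problems; an empty list means no limit was exceeded."""
    problems = []
    if PurePath(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file format")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 50MB")
    if page_count > MAX_PAGES:
        problems.append("only the first 100 pages will be processed")
    if max_page_pixels > MAX_PIXELS_PER_PAGE:
        problems.append("a page exceeds 100,000,000 pixels")
    return problems
```

For example, a US Letter PDF page (612 × 792 points) renders to 1275 × 1650 pixels at 150 DPI, well below the per-page limit.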

Hanja, Hanzi, and Kanji are writing systems based on Chinese characters used in Korean, Chinese, and Japanese writing systems. Despite sharing similarities, they possess distinct visual representations, pronunciations, meanings, and usage conventions within their respective linguistic contexts. For more information, see this article.

For best results, follow these guidelines:

  • Use high-resolution documents to ensure legibility of text.
  • Ensure a minimum document width of 640 pixels.
  • The performance of the model might vary depending on the text size. Ensure that the smallest text in the image is at least 2.5% of the image's height. For example, if the image is 640 pixels tall, the smallest text should be at least 16 pixels tall.
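
The width and text-size guidelines reduce to simple arithmetic. A small sketch with hypothetical helper names follows; rounding the 2.5% threshold up to a whole pixel is an assumption made here, not something the guidelines specify.

```python
MIN_WIDTH_PX = 640  # minimum recommended document width


def min_text_height_px(image_height_px: int) -> int:
    """Smallest recommended text height: 2.5% of the image height, rounded up."""
    # Integer arithmetic (25/1000) avoids floating-point rounding surprises.
    return (image_height_px * 25 + 999) // 1000


def meets_guidelines(width_px: int, height_px: int, smallest_text_px: int) -> bool:
    """Check the width and text-size recommendations for one image."""
    return width_px >= MIN_WIDTH_PX and smallest_text_px >= min_text_height_px(height_px)
```

With a 640-pixel-tall image, `min_text_height_px(640)` gives 16, matching the example in the guideline.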

Response

Functionality overview

  • Unknown characters: Characters that the OCR model does not recognize are considered "unknown characters" and are marked by a placeholder character.
  • Response time: Standard documents containing up to 200 words take approximately three seconds. Longer documents can take tens of seconds.
  • Timeout: All requests are subject to a server-side timeout of 3 minutes.

Return values

mimetype string
The MIME type of the input file (e.g., "application/pdf").

elements list
A list of element objects, each containing information about the element's text, confidence score, and bounding box.

elements[].id integer
The unique identifier for an element.

elements[].page integer
The page number for an element. Starts from 1.

elements[].bounding_box list
A list with four (x, y) coordinates defining the corners of an element's bounding box.
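
Many drawing and cropping tools expect a rectangle rather than four corners. The sketch below converts one bounding box into `(left, top, width, height)`. It assumes each corner is an object with `x` and `y` keys, as in the example response on this page; `to_rect` is a hypothetical helper name.

```python
def to_rect(bounding_box: list[dict]) -> tuple[int, int, int, int]:
    """Convert four corner points into (left, top, width, height)."""
    xs = [point["x"] for point in bounding_box]
    ys = [point["y"] for point in bounding_box]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


# Corners of the "INVOICE" element from the example response.
corners = [
    {"x": 88, "y": 67},
    {"x": 311, "y": 67},
    {"x": 311, "y": 117},
    {"x": 88, "y": 117},
]
rect = to_rect(corners)
```

Taking the minimum and maximum of the coordinates keeps the helper independent of the order in which the corners are listed.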

elements[].category string
The category of a given element. One of: paragraph, table, figure, header, footer, caption, equation.
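
The category field makes it easy to pull out one kind of element, for instance every table. A minimal sketch with hypothetical helper names; only the `category` key of each element is assumed.

```python
from collections import Counter

# Category names as listed in the field description above.
CATEGORIES = {"paragraph", "table", "figure", "header", "footer", "caption", "equation"}


def tally(elements: list[dict]) -> dict[str, int]:
    """Count how many elements fall into each category."""
    return dict(Counter(element["category"] for element in elements))


def select(elements: list[dict], category: str) -> list[dict]:
    """Return the elements of one category, preserving their order."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return [element for element in elements if element["category"] == category]
```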

elements[].text string
A string representing the text within an element.

elements[].html string
A string representing the text within an element, in HTML format.

text string
A string representing the text of the entire document, typically created by concatenating the pages' serialized texts.

html string
A string representing the text of the entire document in HTML format, typically created by concatenating the pages' serialized HTML snippets.
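
The document-level strings cover the whole file, so per-page text has to be rebuilt from the elements. The sketch below groups element texts by page. It assumes the elements arrive in reading order and uses a newline as the separator, which matches the example response on this page but is not stated as a guarantee; `text_by_page` is a hypothetical helper name.

```python
from collections import defaultdict


def text_by_page(elements: list[dict], separator: str = "\n") -> dict[int, str]:
    """Join element texts page by page, keeping the order of the elements list."""
    pages = defaultdict(list)
    for element in elements:
        pages[element["page"]].append(element["text"])
    return {page: separator.join(parts) for page, parts in sorted(pages.items())}
```

The same grouping works for the `html` field by reading `element["html"]` and joining with an empty separator.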

model string
A string representing the version of the model being used.

api string
A string representing the version of the API being used. A bump in the major version indicates a backward-incompatible update, while a minor version increase signifies a backward-compatible update.
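
A client can use this rule to detect responses it was not written for. The sketch below assumes the version string has the form `major.minor`, as in the `"1.1"` of the example response; both function names are hypothetical.

```python
def parse_api_version(version: str) -> tuple[int, int]:
    """Split a version string such as "1.1" into (major, minor)."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def is_compatible(response_version: str, tested_version: str) -> bool:
    """True when the response has the same major version and an equal or newer minor version."""
    got_major, got_minor = parse_api_version(response_version)
    want_major, want_minor = parse_api_version(tested_version)
    return got_major == want_major and got_minor >= want_minor
```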

billed_pages integer
The total count of pages in the input file that have been processed and are chargeable.

metadata object
An object containing metadata about a document, such as page size and page numbers.
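
Page sizes from the metadata can be used to express bounding boxes as fractions of the page, which is convenient when the document is later rendered at a different resolution. This sketch assumes the metadata has the `pages` list shown in the example response (`page`, `width`, `height`) and that bounding-box coordinates are in the same pixel space as those sizes. That assumption is consistent with the example but is not confirmed by the field descriptions. The helper names are hypothetical.

```python
def page_sizes(metadata: dict) -> dict[int, tuple[int, int]]:
    """Map each page number to its (width, height)."""
    return {page["page"]: (page["width"], page["height"]) for page in metadata["pages"]}


def normalize(element: dict, sizes: dict[int, tuple[int, int]]) -> list[tuple[float, float]]:
    """Scale an element's corners to the 0..1 range relative to its page."""
    width, height = sizes[element["page"]]
    return [(point["x"] / width, point["y"] / height) for point in element["bounding_box"]]
```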

Examples

Request

invoice.png
curl -X POST https://api.upstage.ai/v1/document-ai/layout-analysis \
-H "Authorization: Bearer UPSTAGE_API_KEY" \
-F "document=@invoice.png" \
-F "ocr=true"
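
The same request can be sent from Python. The following is a sketch rather than an official client: it relies on the third-party `requests` package, the function names are hypothetical, and the 180-second client timeout is chosen here only to mirror the documented server-side limit.

```python
API_URL = "https://api.upstage.ai/v1/document-ai/layout-analysis"


def build_request(api_key: str, ocr: bool = True) -> dict:
    """Assemble everything except the file, so it can be inspected without network access."""
    return {
        "url": API_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
        "data": {"ocr": "true" if ocr else "false"},  # form field, as in the curl example
        "timeout": 180,  # seconds; assumption mirroring the 3-minute server-side timeout
    }


def analyze(path: str, api_key: str, ocr: bool = True) -> dict:
    """Upload one document and return the parsed JSON response."""
    import requests  # third-party; imported here so build_request has no dependencies

    with open(path, "rb") as handle:
        response = requests.post(files={"document": handle}, **build_request(api_key, ocr))
    response.raise_for_status()
    return response.json()
```

Calling `analyze("invoice.png", "UPSTAGE_API_KEY")` with a real key in place of the placeholder would return a dictionary shaped like the response below.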

Response

{
    "api": "1.1",
    "billed_pages": 1,
    "elements": [
        {
            "bounding_box": [
                {
                    "x": 88,
                    "y": 67
                },
                {
                    "x": 311,
                    "y": 67
                },
                {
                    "x": 311,
                    "y": 117
                },
                {
                    "x": 88,
                    "y": 117
                }
            ],
            "category": "paragraph",
            "html": "<p id='0' style='font-size:22px'>INVOICE</p>",
            "id": 0,
            "page": 1,
            "text": "INVOICE"
        },
        {
            "bounding_box": [
                {
                    "x": 86,
                    "y": 336
                },
                {
                    "x": 208,
                    "y": 336
                },
                {
                    "x": 208,
                    "y": 404
                },
                {
                    "x": 86,
                    "y": 404
                }
            ],
            "category": "paragraph",
            "html": "<p id='1' style='font-size:20px'>Company<br>Upstage</p>",
            "id": 1,
            "page": 1,
            "text": "Company\nUpstage"
        },
        {
            "bounding_box": [
                {
                    "x": 750,
                    "y": 96
                },
                {
                    "x": 1213,
                    "y": 96
                },
                {
                    "x": 1213,
                    "y": 125
                },
                {
                    "x": 750,
                    "y": 125
                }
            ],
            "category": "paragraph",
            "html": "<br><p id='2' style='font-size:16px'>Invoice ID # INV-AJ355548</p>",
            "id": 2,
            "page": 1,
            "text": "Invoice ID # INV-AJ355548"
        },
        {
            "bounding_box": [
                {
                    "x": 86,
                    "y": 423
                },
                {
                    "x": 194,
                    "y": 423
                },
                {
                    "x": 194,
                    "y": 489
                },
                {
                    "x": 86,
                    "y": 489
                }
            ],
            "category": "paragraph",
            "html": "<p id='3' style='font-size:16px'>Name<br>Lucy Park</p>",
            "id": 3,
            "page": 1,
            "text": "Name\nLucy Park"
        },
        {
            "bounding_box": [
                {
                    "x": 89,
                    "y": 517
                },
                {
                    "x": 194,
                    "y": 517
                },
                {
                    "x": 194,
                    "y": 543
                },
                {
                    "x": 89,
                    "y": 543
                }
            ],
            "category": "paragraph",
            "html": "<p id='4' style='font-size:16px'>Address</p>",
            "id": 4,
            "page": 1,
            "text": "Address"
        },
        {
            "bounding_box": [
                {
                    "x": 86,
                    "y": 551
                },
                {
                    "x": 518,
                    "y": 551
                },
                {
                    "x": 518,
                    "y": 656
                },
                {
                    "x": 86,
                    "y": 656
                }
            ],
            "category": "paragraph",
            "html": "<br><p id='5' style='font-size:14px'>7 Pepper Wood Street, 130 Stone Corner<br>Terrace<br>Wilkes Barre, Pennsylvania, 18768<br>United States</p>",
            "id": 5,
            "page": 1,
            "text": "7 Pepper Wood Street, 130 Stone Corner\nTerrace\nWilkes Barre, Pennsylvania, 18768\nUnited States"
        },
        {
            "bounding_box": [
                {
                    "x": 86,
                    "y": 677
                },
                {
                    "x": 377,
                    "y": 677
                },
                {
                    "x": 377,
                    "y": 742
                },
                {
                    "x": 86,
                    "y": 742
                }
            ],
            "category": "paragraph",
            "html": "<br><p id='6' style='font-size:18px'>Email<br>Ikitchenman0@arizona.edu</p>",
            "id": 6,
            "page": 1,
            "text": "Email\nIkitchenman0@arizona.edu"
        },
        {
            "bounding_box": [
                {
                    "x": 753,
                    "y": 158
                },
                {
                    "x": 1213,
                    "y": 158
                },
                {
                    "x": 1213,
                    "y": 186
                },
                {
                    "x": 753,
                    "y": 186
                }
            ],
            "category": "paragraph",
            "html": "<br><p id='7' style='font-size:18px'>Invoice Date 9/7/1992</p>",
            "id": 7,
            "page": 1,
            "text": "Invoice Date 9/7/1992"
        },
        {
            "bounding_box": [
                {
                    "x": 750,
                    "y": 269
                },
                {
                    "x": 1065,
                    "y": 269
                },
                {
                    "x": 1065,
                    "y": 305
                },
                {
                    "x": 750,
                    "y": 305
                }
            ],
            "category": "paragraph",
            "html": "<p id='8' style='font-size:20px'>Service Details Form</p>",
            "id": 8,
            "page": 1,
            "text": "Service Details Form"
        },
        {
            "bounding_box": [
                {
                    "x": 749,
                    "y": 336
                },
                {
                    "x": 856,
                    "y": 336
                },
                {
                    "x": 856,
                    "y": 405
                },
                {
                    "x": 749,
                    "y": 405
                }
            ],
            "category": "paragraph",
            "html": "<p id='9' style='font-size:16px'>Name<br>Sung Kim</p>",
            "id": 9,
            "page": 1,
            "text": "Name\nSung Kim"
        },
        {
            "bounding_box": [
                {
                    "x": 749,
                    "y": 431
                },
                {
                    "x": 1162,
                    "y": 431
                },
                {
                    "x": 1162,
                    "y": 535
                },
                {
                    "x": 749,
                    "y": 535
                }
            ],
            "category": "paragraph",
            "html": "<p id='10' style='font-size:18px'>Address<br>Gwanggyojungang-ro 338, Gyeonggi-do,<br>Sanghyeon-dong, Suji-gu<br>Yongin-si, South Korea</p>",
            "id": 10,
            "page": 1,
            "text": "Address\nGwanggyojungang-ro 338, Gyeonggi-do,\nSanghyeon-dong, Suji-gu\nYongin-si, South Korea"
        },
        {
            "bounding_box": [
                {
                    "x": 87,
                    "y": 849
                },
                {
                    "x": 322,
                    "y": 849
                },
                {
                    "x": 322,
                    "y": 881
                },
                {
                    "x": 87,
                    "y": 881
                }
            ],
            "category": "paragraph",
            "html": "<p id='11' style='font-size:20px'>Additional Request</p>",
            "id": 11,
            "page": 1,
            "text": "Additional Request"
        },
        {
            "bounding_box": [
                {
                    "x": 550,
                    "y": 848
                },
                {
                    "x": 1196,
                    "y": 848
                },
                {
                    "x": 1196,
                    "y": 930
                },
                {
                    "x": 550,
                    "y": 930
                }
            ],
            "category": "paragraph",
            "html": "<br><p id='12' style='font-size:16px'>Vivamus vestibulum sagittis sapien. Cum sociis natoque<br>penatibus et magnis dis parturient montes, nascetur ridiculus<br>mus.</p>",
            "id": 12,
            "page": 1,
            "text": "Vivamus vestibulum sagittis sapien. Cum sociis natoque\npenatibus et magnis dis parturient montes, nascetur ridiculus\nmus."
        },
        {
            "bounding_box": [
                {
                    "x": 88,
                    "y": 1056
                },
                {
                    "x": 327,
                    "y": 1056
                },
                {
                    "x": 327,
                    "y": 1080
                },
                {
                    "x": 88,
                    "y": 1080
                }
            ],
            "category": "paragraph",
            "html": "<p id='13' style='font-size:14px'>TERMS AND CONDITIONS</p>",
            "id": 13,
            "page": 1,
            "text": "TERMS AND CONDITIONS"
        },
        {
            "bounding_box": [
                {
                    "x": 86,
                    "y": 1107
                },
                {
                    "x": 1207,
                    "y": 1107
                },
                {
                    "x": 1207,
                    "y": 1211
                },
                {
                    "x": 86,
                    "y": 1211
                }
            ],
            "category": "paragraph",
            "html": "<p id='14' style='font-size:14px'>1. The Seller shall not be liable to the Buyer directly or indirectly for any loss or damage suffered by the Buyer.<br>2. The Seller warrants the product for one (1) year from the date of shipment.<br>3. Any purchase order received by the seller will be interpreted as accepting this offer and the sale offer in writing. The buyer may<br>purchase the product in this offer only under the Terms and Conditions of the Seller included in this offer.</p>",
            "id": 14,
            "page": 1,
            "text": "1. The Seller shall not be liable to the Buyer directly or indirectly for any loss or damage suffered by the Buyer.\n2. The Seller warrants the product for one (1) year from the date of shipment.\n3. Any purchase order received by the seller will be interpreted as accepting this offer and the sale offer in writing. The buyer may\npurchase the product in this offer only under the Terms and Conditions of the Seller included in this offer."
        }
    ],
    "html": "<p id='0' style='font-size:22px'>INVOICE</p><p id='1' style='font-size:20px'>Company<br>Upstage</p><br><p id='2' style='font-size:16px'>Invoice ID # INV-AJ355548</p><p id='3' style='font-size:16px'>Name<br>Lucy Park</p><p id='4' style='font-size:16px'>Address</p><br><p id='5' style='font-size:14px'>7 Pepper Wood Street, 130 Stone Corner<br>Terrace<br>Wilkes Barre, Pennsylvania, 18768<br>United States</p><br><p id='6' style='font-size:18px'>Email<br>Ikitchenman0@arizona.edu</p><br><p id='7' style='font-size:18px'>Invoice Date 9/7/1992</p><p id='8' style='font-size:20px'>Service Details Form</p><p id='9' style='font-size:16px'>Name<br>Sung Kim</p><p id='10' style='font-size:18px'>Address<br>Gwanggyojungang-ro 338, Gyeonggi-do,<br>Sanghyeon-dong, Suji-gu<br>Yongin-si, South Korea</p><p id='11' style='font-size:20px'>Additional Request</p><br><p id='12' style='font-size:16px'>Vivamus vestibulum sagittis sapien. Cum sociis natoque<br>penatibus et magnis dis parturient montes, nascetur ridiculus<br>mus.</p><p id='13' style='font-size:14px'>TERMS AND CONDITIONS</p><p id='14' style='font-size:14px'>1. The Seller shall not be liable to the Buyer directly or indirectly for any loss or damage suffered by the Buyer.<br>2. The Seller warrants the product for one (1) year from the date of shipment.<br>3. Any purchase order received by the seller will be interpreted as accepting this offer and the sale offer in writing. The buyer may<br>purchase the product in this offer only under the Terms and Conditions of the Seller included in this offer.</p>",
    "metadata": {
        "pages": [
            {
                "height": 1270,
                "page": 1,
                "width": 1308
            }
        ]
    },
    "mimetype": "multipart/form-data",
    "model": "layout-analysis-0.2.1",
    "text": "INVOICE\nCompany\nUpstage\nInvoice ID # INV-AJ355548\nName\nLucy Park\nAddress\n7 Pepper Wood Street, 130 Stone Corner\nTerrace\nWilkes Barre, Pennsylvania, 18768\nUnited States\nEmail\nIkitchenman0@arizona.edu\nInvoice Date 9/7/1992\nService Details Form\nName\nSung Kim\nAddress\nGwanggyojungang-ro 338, Gyeonggi-do,\nSanghyeon-dong, Suji-gu\nYongin-si, South Korea\nAdditional Request\nVivamus vestibulum sagittis sapien. Cum sociis natoque\npenatibus et magnis dis parturient montes, nascetur ridiculus\nmus.\nTERMS AND CONDITIONS\n1. The Seller shall not be liable to the Buyer directly or indirectly for any loss or damage suffered by the Buyer.\n2. The Seller warrants the product for one (1) year from the date of shipment.\n3. Any purchase order received by the seller will be interpreted as accepting this offer and the sale offer in writing. The buyer may\npurchase the product in this offer only under the Terms and Conditions of the Seller included in this offer."
}