Key Information Extraction
Extract key information from target documents.
Available document types
- Receipt
- Air waybill
- Bill of lading
- Shipping request
- Commercial invoice
- Packing list
- Export declaration certificate (KR)
Request
POST https://api.upstage.ai/v1/document-ai/extraction
Parameters
Request headers
Authorization (string, required)
Request body
document (file, required)
model (string, required)
Requirements
- Supported file formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
- Maximum file size: 50MB
- Maximum number of pages per file: 30 pages (For files exceeding 30 pages, the first 30 pages are processed)
- Maximum pixels per page: 100,000,000 pixels. For non-image files, the pixel count is determined after the pages are converted to images at 300 DPI.
- Supported character sets: Alphanumeric, Hangul, and Hanja are supported. Hanzi and Kanji are in beta: they are available but not fully supported.
Hanja, Hanzi, and Kanji are writing systems based on Chinese characters, used in Korean, Chinese, and Japanese respectively. Despite their similarities, they differ in visual representation, pronunciation, meaning, and usage conventions within each language. For more information, see this article.
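The limits above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the extension list is an assumed mapping from the format names above, "50MB" is assumed to mean 50 MiB, and the helper names (preflight_problems, pixels_at_render_dpi) are hypothetical. The pixel helper works through the 300 DPI conversion for a page whose size is given in points (72 points per inch).

```python
import os

# Assumed mapping from the supported format names to file extensions.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff",
    ".heic", ".docx", ".pptx", ".xlsx",
}
MAX_FILE_BYTES = 50 * 1024 * 1024   # assumption: "50MB" read as 50 MiB
MAX_PIXELS_PER_PAGE = 100_000_000
RENDER_DPI = 300
POINTS_PER_INCH = 72                # page sizes in points, as in PDF


def pixels_at_render_dpi(width_pt, height_pt):
    """Estimate the pixel count of one page rendered at 300 DPI."""
    width_px = round(width_pt / POINTS_PER_INCH * RENDER_DPI)
    height_px = round(height_pt / POINTS_PER_INCH * RENDER_DPI)
    return width_px * height_px


def preflight_problems(filename, size_bytes):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems


# A US Letter page (612 x 792 points) renders to 2550 x 3300 pixels.
letter_pixels = pixels_at_render_dpi(612, 792)
print(letter_pixels, letter_pixels <= MAX_PIXELS_PER_PAGE)
```

For a file on disk, the size can be obtained with os.path.getsize(path) and passed as size_bytes. The page count limit is not checked here because counting pages depends on the file format.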
Response
Functionality overview
- Unknown characters: Characters that the model detects but cannot recognize are considered "unknown characters" and are marked with the character �.
- Response time: Files with fewer than 30 words typically take about three seconds. Longer documents can take tens of seconds.
- Timeout: There is a server-side 3-minute timeout for all requests.
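A client may want to flag extracted values that contain the unknown-character marker so they can be reviewed manually. The sketch below assumes the marker is U+FFFD (REPLACEMENT CHARACTER) and that field objects follow the key/value/properties layout described under Return values; the function name is hypothetical.

```python
# Assumption: the unknown-character marker is U+FFFD REPLACEMENT CHARACTER.
UNKNOWN_CHAR = "\ufffd"


def fields_with_unknown_chars(fields):
    """Return the keys of all fields (including subfields) whose value
    contains the unknown-character marker."""
    flagged = []
    for field in fields:
        if UNKNOWN_CHAR in field.get("value", ""):
            flagged.append(field["key"])
        # Group fields carry their subfields in "properties".
        flagged.extend(fields_with_unknown_chars(field.get("properties", [])))
    return flagged
```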
Return values
mimeType (string): The MIME type of the input file (e.g., "multipart/form-data").
documentType (string): The document type of the input file (e.g., "receipt").
fields (list): A list of field objects containing information about the extracted key-value pairs, their types, values, and confidence scores.
*.key (string): The key name of the extracted field.
*.type (string): The data type of the extracted field (e.g., "date", "monetary_usd").
*.value (string): The extracted value of the field.
*.confidence (float): A value between 0 and 1 representing the confidence score for the value's extraction. A higher value indicates greater confidence in the accuracy of the field.
*.id (integer): The unique identifier for the field.
*.refinedValue (string, optional): A refined version of the extracted value (e.g., a standardized date format).
*.properties (list, optional): A list of field objects containing information about the subfields within a group. Different indexes represent different groups. Each group contains subfields with their types, values, and confidence scores.
stored (boolean): Indicates whether the input was stored. If true, the data has been stored. If false, the data was discarded immediately.
modelVersion (string): The version of the model being used.
apiVersion (string): The version of the API being used. A major version bump indicates a backward-incompatible update, while a minor version increase indicates a backward-compatible update.
numBilledPages (integer): The total number of pages in the input file that have been processed and are chargeable.
metadata (object): An object containing metadata about the document, such as page size and page numbers.
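The versioning rule described for apiVersion can be checked on the client. This is a minimal sketch that assumes the value has the form "major.minor" (as in "1.1"); the function names and the idea of pinning an expected major version are illustrative, not part of the API.

```python
def parse_api_version(api_version):
    """Split a version string such as "1.1" into (major, minor) integers."""
    parts = api_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def is_compatible(api_version, expected_major):
    """A response is treated as compatible when its major version matches
    the one the client was written against; minor bumps are accepted."""
    return parse_api_version(api_version)[0] == expected_major
```

A client written against version 1.x would accept "1.1" or "1.4" and could log a warning or refuse to parse when it sees "2.0".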
Example
Request
curl -X POST https://api.upstage.ai/v1/document-ai/extraction \
-H "Authorization: Bearer UPSTAGE_API_KEY" \
-F "document=@receipt.jpg" \
-F "model=receipt-extraction"
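The same request can be sketched in Python using only the standard library. The multipart body is assembled by hand, so the helper names (build_extraction_request, send) are hypothetical, and the 190-second client timeout is an assumption chosen to sit slightly above the 3-minute server-side timeout. The file part is sent as application/octet-stream, which is also an assumption.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.upstage.ai/v1/document-ai/extraction"


def build_extraction_request(api_key, filename, file_bytes, model):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Text field: model
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # File field: document
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="document"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: application/octet-stream\r\n\r\n").encode(),
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send(request, timeout_s=190):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.load(resp)
```

Usage would look like `send(build_extraction_request(key, "receipt.jpg", open("receipt.jpg", "rb").read(), "receipt-extraction"))`. Filenames containing double quotes would need escaping, which this sketch omits.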
Response
{
"apiVersion": "1.1",
"confidence": 0.8693423350242669,
"documentType": "receipt",
"fields": [
{
"id": 0,
"key": "group_0",
"properties": [
{
"confidence": 0.9924821257591248,
"id": 1,
"key": "group_0/menu.unit_product_quantity",
"refinedValue": "3",
"type": "content",
"value": "3"
},
{
"confidence": 0.9585183839652158,
"id": 2,
"key": "group_0/menu.product_name",
"refinedValue": "Classic Mojito",
"type": "content",
"value": "Classic Mojito"
},
{
"confidence": 0.9199805958380287,
"id": 3,
"key": "group_0/menu.unit_product_total_price_before_discount",
"refinedValue": "45.00",
"type": "content",
"value": "$45 00"
}
],
"refinedValue": "",
"type": "group",
"value": "3 Classic Mojito $45 00"
},
{
"id": 4,
"key": "group_0",
"properties": [
{
"confidence": 0.9604779481887817,
"id": 5,
"key": "group_0/menu.unit_product_quantity",
"refinedValue": "1",
"type": "content",
"value": "1"
},
{
"confidence": 0.9843973246343797,
"id": 6,
"key": "group_0/menu.product_name",
"refinedValue": "Margarita Classic",
"type": "content",
"value": "Margarita Classic"
},
{
"confidence": 0.9064315192491437,
"id": 7,
"key": "group_0/menu.unit_product_total_price_before_discount",
"refinedValue": "15.00",
"type": "content",
"value": "$15 00"
}
],
"refinedValue": "",
"type": "group",
"value": "1 Margarita Classic $15 00"
},
{
"id": 8,
"key": "group_0",
"properties": [
{
"confidence": 0.9691067934036255,
"id": 9,
"key": "group_0/menu.unit_product_quantity",
"refinedValue": "1",
"type": "content",
"value": "1"
},
{
"confidence": 0.5245915964832683,
"id": 10,
"key": "group_0/menu.product_name",
"refinedValue": "Parrillada Yuca con Mcjo Yuca Fritas Medium",
"type": "content",
"value": "Parrillada Yuca con Mcjo Yuca Fritas Medium"
},
{
"confidence": 0.9572980538130436,
"id": 11,
"key": "group_0/menu.unit_product_total_price_before_discount",
"refinedValue": "90.00",
"type": "content",
"value": "$90 00"
}
],
"refinedValue": "",
"type": "group",
"value": "1 Parrillada Yuca con Mcjo Yuca Fritas Medium $90 00"
},
{
"id": 12,
"key": "group_0",
"properties": [
{
"confidence": 0.9693443179130554,
"id": 13,
"key": "group_0/menu.unit_product_quantity",
"refinedValue": "1",
"type": "content",
"value": "1"
},
{
"confidence": 0.8845985036390739,
"id": 14,
"key": "group_0/menu.product_name",
"refinedValue": "Hemingway S Ma lecon",
"type": "content",
"value": "Hemingway S Ma lecon"
},
{
"confidence": 0.8875203900519126,
"id": 15,
"key": "group_0/menu.unit_product_total_price_before_discount",
"refinedValue": "15.00",
"type": "content",
"value": "$15.00"
}
],
"refinedValue": "",
"type": "group",
"value": "1 Hemingway S Ma lecon $15.00"
},
{
"id": 16,
"key": "group_0",
"properties": [
{
"confidence": 0.977379322052002,
"id": 17,
"key": "group_0/menu.unit_product_quantity",
"refinedValue": "1",
"type": "content",
"value": "1"
},
{
"confidence": 0.9403543115198569,
"id": 18,
"key": "group_0/menu.product_name",
"refinedValue": "Strawberry Moj ito",
"type": "content",
"value": "Strawberry Moj ito"
},
{
"confidence": 0.9263305215507435,
"id": 19,
"key": "group_0/menu.unit_product_total_price_before_discount",
"refinedValue": "15.00",
"type": "content",
"value": "$15 00"
}
],
"refinedValue": "",
"type": "group",
"value": "1 Strawberry Moj ito $15 00"
},
{
"confidence": 0.9819169006082817,
"id": 20,
"key": "store.store_name",
"refinedValue": "HAVANA VIEJA",
"type": "content",
"value": "HAVANA VIEJA"
},
{
"confidence": 0.6618761453146255,
"id": 21,
"key": "store.store_address",
"refinedValue": "944 Washington Ave Miami Beach F1. 33139",
"type": "content",
"value": "944 Washington Ave Miami Beach F1. 33139"
},
{
"confidence": 0.9922102908265871,
"id": 22,
"key": "total.subtotal_price",
"refinedValue": "Subtotal",
"type": "header",
"value": "Subtotal"
},
{
"confidence": 0.7275943936665118,
"id": 23,
"key": "total.subtotal_price",
"refinedValue": "180.00",
"type": "content",
"value": "$180 .00"
},
{
"confidence": 0.2429200998999827,
"id": 24,
"key": "total.tip_price",
"refinedValue": "Gratuity (20 00%",
"type": "header",
"value": "Gratuity (20 00%"
},
{
"confidence": 0.8521402045045384,
"id": 25,
"key": "total.tip_price",
"refinedValue": "36.00",
"type": "content",
"value": "$36.00"
},
{
"confidence": 0.8664104635738139,
"id": 26,
"key": "total.tax_price",
"refinedValue": "16.20",
"type": "content",
"value": "$16 .20"
},
{
"confidence": 0.8695934498569926,
"id": 27,
"key": "total.tax_price",
"refinedValue": "Tax",
"type": "header",
"value": "Tax"
},
{
"confidence": 0.9746453544850778,
"id": 28,
"key": "total.charged_price",
"refinedValue": "Total",
"type": "header",
"value": "Total"
},
{
"confidence": 0.9548797788891511,
"id": 29,
"key": "total.charged_price",
"refinedValue": "232.20",
"type": "content",
"value": "$232.20"
}
],
"metadata": {
"pages": [
{
"height": 4032,
"page": 1,
"width": 3024
}
]
},
"mimeType": "multipart/form-data",
"modelVersion": "receipt-extraction-3.2.0",
"numBilledPages": 1,
"stored": true
}
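A response like the one above can be reduced to line items and a list of values that need review. The sketch below assumes the structure shown in this example: group fields carry their subfields in "properties", and subfield keys are prefixed with the group key and a slash. The function names and the 0.7 confidence threshold are illustrative choices, not recommendations from the API.

```python
def line_items(response):
    """Turn each "group" field into a dict keyed by the subfield name."""
    items = []
    for field in response.get("fields", []):
        if field.get("type") != "group":
            continue
        item = {}
        for prop in field.get("properties", []):
            # "group_0/menu.product_name" -> "menu.product_name"
            name = prop["key"].split("/", 1)[-1]
            # Prefer the refined value when one is present.
            item[name] = prop.get("refinedValue") or prop["value"]
        items.append(item)
    return items


def low_confidence_keys(response, threshold=0.7):
    """List (key, confidence) pairs whose confidence is below threshold."""
    flagged = []
    for field in response.get("fields", []):
        # For groups, inspect the subfields; otherwise the field itself.
        for leaf in field.get("properties", [field]):
            if leaf.get("confidence", 1.0) < threshold:
                flagged.append((leaf["key"], leaf["confidence"]))
    return flagged
```

Applied to the example response, line_items would yield one dict per menu row, with the price taken from refinedValue (for example "45.00" rather than "$45 00").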