Models

Upstage develops a suite of AI models tailored to diverse business needs, including Solar LLM and Document AI, in pursuit of its mission to achieve AGI (Artificial General Intelligence) for work.

Solar LLM

Upstage Solar is a compact yet powerful large language model (LLM).

Model	Release date	Context Length	Description
solar-1-mini-chat	2024-06-14 `beta`	32768	A compact LLM offering superior performance to GPT-3.5, with robust multilingual capabilities for both English and Korean, delivering high efficiency in a smaller package. `solar-1-mini-chat` is an alias for our latest solar-1-mini-chat model (currently `solar-1-mini-chat-240612`).
solar-1-mini-chat-ja	2024-06-14 `beta`	32768	A compact LLM that extends the capabilities of solar-1-mini-chat with specialization in Japanese, while maintaining high efficiency and performance in English and Korean. `solar-1-mini-chat-ja` is an alias for our latest solar-1-mini-chat-ja model (currently `solar-1-mini-chat-ja-240612`).
solar-embedding-1-large-query	2024-05-10 `beta`	4000	Solar-based query embedding model with a 4k context limit. This model is optimized for embedding a user's question in information-seeking tasks such as retrieval and reranking.
solar-embedding-1-large-passage	2024-05-10 `beta`	4000	Solar-based passage embedding model with a 4k context limit. This model is optimized for embedding the documents or texts to be searched.
solar-1-mini-translate-enko	2024-05-22 `beta`	32768	A model specialized for English-to-Korean translation, based on solar-mini. Maximum context length is 32k tokens. `solar-1-mini-translate-enko` is an alias for our latest solar-1-mini-translate-enko model (currently `solar-1-mini-translate-enko-240507`).
solar-1-mini-translate-koen	2024-05-22 `beta`	32768	A model specialized for Korean-to-English translation, based on solar-mini. Maximum context length is 32k tokens. `solar-1-mini-translate-koen` is an alias for our latest solar-1-mini-translate-koen model (currently `solar-1-mini-translate-koen-240507`).
solar-1-mini-groundedness-check	2024-05-02 `beta`	32768	Solar-based groundedness check model with a 32k context limit. `solar-1-mini-groundedness-check` is an alias for our latest solar-1-mini-groundedness-check model (currently `solar-1-mini-groundedness-check-240502`).
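
Because an alias moves whenever a new snapshot is released, an application that needs reproducible behavior may prefer to pin the dated model name. The following is a minimal sketch of that idea; the `SOLAR_ALIASES` table and the `pin_model` helper are hypothetical names introduced here for illustration, and the mapping simply copies the "currently" values from the table above, so it would need updating when the aliases change.

```python
# Alias -> dated snapshot, copied from the table above.
# Assumption: these values are only correct as of this page's release dates.
SOLAR_ALIASES = {
    "solar-1-mini-chat": "solar-1-mini-chat-240612",
    "solar-1-mini-chat-ja": "solar-1-mini-chat-ja-240612",
    "solar-1-mini-translate-enko": "solar-1-mini-translate-enko-240507",
    "solar-1-mini-translate-koen": "solar-1-mini-translate-koen-240507",
    "solar-1-mini-groundedness-check": "solar-1-mini-groundedness-check-240502",
}


def pin_model(name: str) -> str:
    """Return the dated snapshot for an alias.

    Names that are not known aliases (for example the embedding models,
    which the table lists without a dated snapshot) pass through unchanged.
    """
    return SOLAR_ALIASES.get(name, name)


pinned = pin_model("solar-1-mini-chat")
unchanged = pin_model("solar-embedding-1-large-query")
```

Pinning trades automatic upgrades for stability: the alias always follows the latest snapshot, while the dated name keeps pointing at the same one.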

For details about the model architecture, see this paper.
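
The two embedding models listed above are described as a pair: one embeds the user's question, the other embeds the texts to be searched. A common way to use such a pair is to rank passages by cosine similarity to the query vector. The sketch below illustrates only that ranking step, under stated assumptions: the vectors are tiny made-up lists standing in for real embeddings, and `cosine` and `rank_passages` are hypothetical helpers, not part of any Upstage API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors (assumption: real embeddings are much larger
# and would come from the query and passage models respectively).
query_vec = [1.0, 0.0, 0.0]
passage_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # partially aligned
]

ranking = rank_passages(query_vec, passage_vecs)  # [1, 2, 0]
```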

Document OCR

Extract all text from any document.

Model	Availability	Release date	Description
ocr-2.2.1	Latest	2024-06-11	Additional support for Japanese character set.
ocr-2.1.1	Deprecated	2024-04-04	Improved text detection for single characters and special characters.
ocr-2.1.0	Deprecated	2024-02-28	Additional support for Hanja, Hanzi and Kanji. Improved accuracy and performance.
ocr-1.0.0	Deprecated	2023-04-10	An OCR model specialized for English and Korean. Resilient against real-world images, including wrinkled papers and rotated text.
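
The model names in this table end in a dotted numeric version, so the newest one can be found by comparing versions as integer tuples rather than as strings. This is a small sketch of that comparison; `split_version` is a hypothetical helper, and it assumes every name follows the `name-major.minor.patch` pattern seen here.

```python
def split_version(model: str):
    """Split 'ocr-2.2.1' into ('ocr', (2, 2, 1)).

    Assumes the text after the last hyphen is a dotted numeric version.
    """
    name, _, version = model.rpartition("-")
    return name, tuple(int(part) for part in version.split("."))


# Model names copied from the table above.
ocr_models = ["ocr-2.1.1", "ocr-1.0.0", "ocr-2.2.1", "ocr-2.1.0"]

# Tuples compare element by element, so (2, 2, 1) > (2, 1, 1) > (1, 0, 0).
latest = max(ocr_models, key=lambda m: split_version(m)[1])
```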

Layout Analysis

Extract tables and figures from any document.

Model	Availability	Release date	Description
layout-analysis-0.4.0	Latest	2024-07-04 `beta`	Improved the accuracy of table recognition. Added new layout elements: `heading1`, `list`, `index`, and `footnote`. Changed the default value of the `ocr` field to `false`.
layout-analysis-0.3.1	Deprecated	2024-06-17 `beta`	Fixed a bug where extracted text from table elements was truncated.
layout-analysis-0.3.0	Deprecated	2024-06-11 `beta`	Improved the inference speed by 2x for digital-born PDF documents.
layout-analysis-0.2.1	Deprecated	2024-05-02 `beta`	Removed unnecessary `<thead>` tags from table elements and fixed bugs.
layout-analyzer-0.2.0	Deprecated	2024-04-04 `beta`	Improved the accuracy of table recognition and the performance of layout detection.
layout-analyzer-0.1.0	Deprecated	2024-02-28 `beta`	A layout analyzer model that detects elements within a document, recognizes tables, and serializes elements according to reading order.

Key Information Extraction

Extract key information from target documents.

Model	Availability	Release date	Description
air-waybill-extraction-4.1.6	Latest	2024-06-11 `beta`	An extractor model for air waybills (AWB).
bill-of-lading-and-shipping-request-extraction-4.1.6	Latest	2024-06-11 `beta`	A consolidated extractor model for bills of lading (BL or BoL) and shipping requests (SR).
commercial-invoice-and-packing-list-extraction-4.1.6	Latest	2024-06-11 `beta`	A consolidated extractor model for commercial invoices (CI) and packing lists (PL).
kr-export-declaration-certificate-extraction-4.1.6	Latest	2024-06-11 `beta`	An extractor model for Korea export declaration certificate.
receipt-extraction-3.2.0	Latest	2024-04-11	Additional support for English. Improved accuracy and performance.
receipt-extractor-1.0.0	Deprecated	2023-04-11	An extractor model for paper receipts that include store descriptions and lists of items. Works best for Korean receipts.
