Models
Upstage develops a suite of AI models for diverse business needs, including Solar LLM and Document AI, in pursuit of its mission to achieve AGI (Artificial General Intelligence) for work.
Solar LLM
Upstage Solar is a compact yet powerful large language model (LLM).
Model | Release date | Context length (tokens) | Description |
---|---|---|---|
solar-1-mini-chat | 2024-06-14 beta | 32768 | A compact LLM offering superior performance to GPT-3.5, with robust multilingual capabilities for both English and Korean, delivering high efficiency in a smaller package. solar-1-mini-chat is an alias for our latest solar-1-mini-chat model (currently solar-1-mini-chat-240612). |
solar-1-mini-chat-ja | 2024-06-14 beta | 32768 | A compact LLM that extends the capabilities of solar-1-mini-chat with specialization in Japanese, while maintaining high efficiency and performance in English and Korean. solar-1-mini-chat-ja is an alias for our latest solar-1-mini-chat-ja model (currently solar-1-mini-chat-ja-240612). |
solar-embedding-1-large-query | 2024-05-10 beta | 4000 | Solar-base Query Embedding model with a 4k context limit. This model is optimized for embedding users' questions in information-seeking tasks such as retrieval and reranking. |
solar-embedding-1-large-passage | 2024-05-10 beta | 4000 | Solar-base Passage Embedding model with a 4k context limit. This model is optimized for embedding documents or texts to be searched. |
solar-1-mini-translate-enko | 2024-05-22 beta | 32768 | English-to-Korean translation model based on solar-mini. Maximum context length is 32k tokens. solar-1-mini-translate-enko is an alias for our latest solar-1-mini-translate-enko model (currently solar-1-mini-translate-enko-240507). |
solar-1-mini-translate-koen | 2024-05-22 beta | 32768 | Korean-to-English translation model based on solar-mini. Maximum context length is 32k tokens. solar-1-mini-translate-koen is an alias for our latest solar-1-mini-translate-koen model (currently solar-1-mini-translate-koen-240507). |
solar-1-mini-groundedness-check | 2024-05-02 beta | 32768 | Solar-based groundedness check model with a 32k context limit. solar-1-mini-groundedness-check is an alias for our latest solar-1-mini-groundedness-check model (currently solar-1-mini-groundedness-check-240502). |
For details about the model architecture, see this paper.
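Several rows in the table above note that a model name is an alias for a dated release. When reproducibility matters, it may help to pin the dated name instead of the alias. The following is a minimal sketch, assuming the dated names can be used wherever the alias is accepted; `PINNED_MODELS` and `resolve_model` are hypothetical names introduced for illustration, and the mapping only reflects the "currently" values listed in the table.

```python
# Alias -> dated model name, copied from the "currently ..." notes in the table.
# These values go stale whenever an alias is repointed to a newer release.
PINNED_MODELS = {
    "solar-1-mini-chat": "solar-1-mini-chat-240612",
    "solar-1-mini-chat-ja": "solar-1-mini-chat-ja-240612",
    "solar-1-mini-translate-enko": "solar-1-mini-translate-enko-240507",
    "solar-1-mini-translate-koen": "solar-1-mini-translate-koen-240507",
    "solar-1-mini-groundedness-check": "solar-1-mini-groundedness-check-240502",
}


def resolve_model(name: str) -> str:
    """Return the dated name for a known alias; pass any other name through unchanged."""
    return PINNED_MODELS.get(name, name)


if __name__ == "__main__":
    for alias in ("solar-1-mini-chat", "solar-embedding-1-large-query"):
        print(alias, "->", resolve_model(alias))
```

The embedding models have no dated variant listed, so they pass through unchanged.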
Document OCR
Extract all text from any document.
Model | Availability | Release date | Description |
---|---|---|---|
ocr-2.2.1 | Latest | 2024-06-11 | Additional support for Japanese character set. |
ocr-2.1.1 | Deprecated | 2024-04-04 | Improved text detection for single characters and special characters. |
ocr-2.1.0 | Deprecated | 2024-02-28 | Additional support for Hanja, Hanzi and Kanji. Improved accuracy and performance. |
ocr-1.0.0 | Deprecated | 2023-04-10 | An OCR model specialized for English and Korean. Resilient against real-world images, including wrinkled papers and rotated text. |
Layout Analysis
Extract tables and figures from any document.
Model | Availability | Release date | Description |
---|---|---|---|
layout-analysis-0.4.0 | Latest | 2024-07-04 beta | Improved the accuracy for table recognition. Added new layout elements: heading1, list, index, and footnote. Changed the default value for the ocr field to false. |
layout-analysis-0.3.1 | Deprecated | 2024-06-17 beta | Fixed a bug where extracted text from table elements was truncated. |
layout-analysis-0.3.0 | Deprecated | 2024-06-11 beta | Improved the inference speed by 2x for digital-born PDF documents. |
layout-analysis-0.2.1 | Deprecated | 2024-05-02 beta | Removed unnecessary <thead> tags from table elements and fixed bugs. |
layout-analyzer-0.2.0 | Deprecated | 2024-04-04 beta | Improved the accuracy for table recognition and performance for layout detection. |
layout-analyzer-0.1.0 | Deprecated | 2024-02-28 beta | A layout analyzer model which detects elements within a document, recognizes tables, and serializes elements according to reading order. |
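The Document AI model names in these tables end in a three-part version number (for example, layout-analysis-0.4.0). As a rough illustration of how such names could be compared on the client side, the sketch below splits a name into a family and a numeric version and picks the highest version from a list. `split_model_name` and `newest` are hypothetical helpers, not part of any Upstage SDK, and the sketch assumes every name follows the `<family>-<major>.<minor>.<patch>` pattern seen here.

```python
import re

# Matches "<family>-<major>.<minor>.<patch>", e.g. "ocr-2.2.1".
_VERSIONED = re.compile(r"^(?P<family>.+)-(?P<version>\d+\.\d+\.\d+)$")


def split_model_name(name: str) -> tuple[str, tuple[int, ...]]:
    """Split a versioned model name into (family, (major, minor, patch))."""
    match = _VERSIONED.match(name)
    if match is None:
        raise ValueError(f"not a versioned model name: {name!r}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("family"), version


def newest(names: list[str]) -> str:
    """Return the name with the highest version, comparing versions numerically."""
    return max(names, key=lambda n: split_model_name(n)[1])


if __name__ == "__main__":
    layout_models = [
        "layout-analyzer-0.1.0",
        "layout-analyzer-0.2.0",
        "layout-analysis-0.2.1",
        "layout-analysis-0.3.0",
        "layout-analysis-0.3.1",
        "layout-analysis-0.4.0",
    ]
    print(newest(layout_models))
```

Comparing versions as integer tuples rather than strings avoids ordering mistakes such as "0.10.0" sorting before "0.9.0". Note that the family prefix changed from layout-analyzer to layout-analysis between releases, so `newest` deliberately ignores the family when comparing.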
Key Information Extraction
Extract key information from target documents.
Model | Availability | Release date | Description |
---|---|---|---|
air-waybill-extraction-4.1.6 | Latest | 2024-06-11 beta | An extractor model for air waybills (AWB). |
bill-of-lading-and-shipping-request-extraction-4.1.6 | Latest | 2024-06-11 beta | A consolidated extractor model for bills of lading (BL or BoL) and shipping requests (SR). |
commercial-invoice-and-packing-list-extraction-4.1.6 | Latest | 2024-06-11 beta | A consolidated extractor model for commercial invoices (CI) and packing lists (PL). |
kr-export-declaration-certificate-extraction-4.1.6 | Latest | 2024-06-11 beta | An extractor model for Korea export declaration certificate. |
receipt-extraction-3.2.0 | Latest | 2024-04-11 | Additional support for English. Improved accuracy and performance. |
receipt-extractor-1.0.0 | Deprecated | 2023-04-11 | An extractor model for paper receipts that include store descriptions and lists of items. Works best for Korean receipts. |