LG AI Research on April 9 unveiled EXAONE 4.5, a multimodal artificial intelligence model that can understand and reason across both text and images.
Multimodal AI systems are designed to process different types of data, including text, images, audio and video, within a single framework.
EXAONE 4.5 combines a proprietary vision encoder with a large language model, allowing it to interpret written and visual information together. The model is tailored for complex, real-world materials such as contracts, technical drawings and financial statements.
LG AI Research said the release marks a step forward for its proprietary foundation model, K-EXAONE, expanding its ability to handle a wider range of data formats.
The model performed strongly in benchmark tests, posting an average score of 77.3 across five science, technology, engineering and mathematics (STEM) benchmarks, outperforming GPT-5 mini at 73.5, Claude Sonnet 4.5 at 74.6 and Qwen3 at 77.0.
EXAONE 4.5 also outperformed GPT-5 mini and Claude Sonnet 4.5 in average scores across 13 visual capability benchmarks. LG AI Research said the results show the model can understand the context of both images and text and respond to questions with integrated reasoning.
Lee Min-a omg@donga.com