xAI introduces Grok-1.5V AI model with image processing capabilities: Check details
xAI has introduced its new Grok-1.5 Vision or Grok-1.5V AI model.
Grok-1.5V is xAI's first multimodal model.
Grok-1.5V will be available soon to xAI’s early testers and existing Grok users.
Elon Musk’s xAI has introduced its new Grok-1.5 Vision or Grok-1.5V AI model. Grok-1.5V is the company’s first multimodal model. In addition to its text capabilities, Grok can now understand a wide variety of visual information, including documents, diagrams, charts, screenshots, and photographs. It’s important to note that Grok-1.5V has not yet been released and will be available soon to xAI’s early testers and existing Grok users.
Let’s delve into the capabilities of the Grok-1.5V AI model.
Grok-1.5V: Capabilities
Grok-1.5V is competitive with existing frontier multimodal models in a number of domains, ranging from multi-disciplinary reasoning to understanding documents, science diagrams, charts, screenshots, and photographs.
According to xAI, the model outperforms its peers on the company’s new RealWorldQA benchmark, which evaluates real-world spatial understanding.
RealWorldQA
xAI has introduced a new benchmark called RealWorldQA. It is designed to evaluate the basic real-world spatial understanding of multimodal models. While many of the examples in the current benchmark are relatively easy for humans, they often pose a challenge for frontier models.
The initial release of RealWorldQA consists of more than 700 images, each paired with a question and an easily verifiable answer. The dataset includes anonymised images taken from vehicles, along with other real-world images.
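To make that structure concrete, here is a minimal Python sketch of how a dataset of image, question, and answer records could be scored. This is not xAI's evaluation code: the names `BenchmarkItem`, `normalize`, and `accuracy`, the `predict` callback standing in for a model, and the exact-match scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class BenchmarkItem:
    """One hypothetical benchmark record: an image, a question, a short answer."""
    image_path: str
    question: str
    answer: str


def normalize(text: str) -> str:
    """Trim, lowercase, and drop trailing periods so 'Left.' matches 'left'."""
    return text.strip().lower().rstrip(".")


def accuracy(items: Iterable[BenchmarkItem],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of items where the predicted answer equals the reference answer.

    `predict` is a stand-in for any model: it receives an image path and a
    question and returns an answer string. Exact match after normalization
    is an assumed scoring rule, chosen because the answers are short and
    easily verifiable.
    """
    items = list(items)
    if not items:
        raise ValueError("no benchmark items to score")
    correct = sum(
        normalize(predict(item.image_path, item.question)) == normalize(item.answer)
        for item in items
    )
    return correct / len(items)
```

Calling `accuracy(items, predict)` with a list of `BenchmarkItem` records and any function that answers a question about an image returns a score between 0 and 1. Exact matching only works here because each answer is short and easy to verify; longer free-form answers would need a different comparison.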
xAI stated that advancing both its multimodal understanding and its generation capabilities is an important step towards building beneficial AGI that can understand the universe. In the coming months, xAI anticipates making significant improvements in both capabilities across various modalities such as images, audio, and video.
Elon Musk appears to be pushing hard to keep his chatbot competitive with rivals such as OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude.
Ayushi Jain
Tech news writer by day, BGMI player by night. Combining my passion for tech and gaming to bring you the latest in both worlds.