Mistral adds a new API that turns any PDF document into an AI-ready Markdown file | TechCrunch

On Thursday, French large language model (LLM) developer Mistral launched a new API for developers who handle complex PDF documents. Mistral OCR is an optical character recognition (OCR) API that can turn any PDF into a text file to make it easier for AI models to ingest.

LLMs, which underpin popular GenAI tools like OpenAI’s ChatGPT, work particularly well with raw text. Companies that want to build their own AI workflows therefore need to store and index their data in a clean format so that it can be reused for AI processing.

Unlike most OCR APIs, Mistral OCR is a multimodal API, meaning that it can detect when there are illustrations and photos intertwined with blocks of text. The OCR API creates bounding boxes around these graphical elements and includes them in the output.
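To make the idea concrete, here is a minimal Python sketch of how a page that mixes text with a detected image region might be represented. The class names (`BoundingBox`, `ImageRegion`, `OcrPage`), their fields, and the `image_coverage` helper are hypothetical illustrations, not Mistral's actual response schema.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    # Pixel coordinates of a rectangle on the page (hypothetical layout).
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self) -> int:
        # Clamp at zero so a malformed box never yields a negative area.
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


@dataclass
class ImageRegion:
    image_id: str  # hypothetical identifier linking the region to the text
    box: BoundingBox


@dataclass
class OcrPage:
    markdown: str  # extracted text for the page
    images: list[ImageRegion] = field(default_factory=list)


def image_coverage(page: OcrPage, page_width: int, page_height: int) -> float:
    """Fraction of the page area covered by image regions (overlaps ignored)."""
    total = sum(region.box.area() for region in page.images)
    return total / (page_width * page_height)


# Example page: one 400 x 400 figure on a 1000 x 1600 page.
page = OcrPage(
    markdown="# Report\n\n![figure-1](figure-1)\n\nSome text.",
    images=[ImageRegion("figure-1", BoundingBox(100, 200, 500, 600))],
)
coverage = image_coverage(page, 1000, 1600)
```

Keeping the boxes next to the text lets downstream code decide whether a page is mostly prose or mostly graphics before sending it to a model.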

Mistral OCR also doesn’t just output a big wall of text; the output is formatted in Markdown, a formatting syntax that developers use to add links, headers, and other formatting elements to a plain text file.
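One practical benefit of Markdown over a wall of text is that the headers give software something to cut on. The sketch below, using only Python's standard library, splits a Markdown string into (heading, body) pairs. The function name `split_markdown_sections` and the sample invoice are made up for illustration, and the parser is deliberately naive: it does not handle `#` characters inside fenced code blocks.

```python
import re


def split_markdown_sections(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs on ATX headings (#, ##, ...)."""
    sections = []
    heading = ""  # text before the first heading gets an empty heading
    body_lines = []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous section if it had a heading or any content.
            if heading or any(l.strip() for l in body_lines):
                sections.append((heading, "\n".join(body_lines).strip()))
            heading = match.group(2).strip()
            body_lines = []
        else:
            body_lines.append(line)
    # Flush the final section.
    if heading or any(l.strip() for l in body_lines):
        sections.append((heading, "\n".join(body_lines).strip()))
    return sections


sample = (
    "# Invoice\n\nTotal due: 42 EUR\n\n"
    "## Terms\n\n- Pay within 30 days\n- See [policy](https://example.com)\n"
)
sections = split_markdown_sections(sample)
```

Each resulting section can then be stored and indexed on its own, which is harder to do reliably with unstructured text.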

LLMs rely heavily on Markdown for their training datasets. Similarly, AI assistants such as Mistral’s Le Chat or OpenAI’s ChatGPT often generate Markdown to create bullet lists, add links, or put some elements in bold. Assistant apps then render that Markdown as rich text. That’s why raw text, along with Markdown, has become more important in recent years as GenAI has boomed.

“Over the years, organizations have accumulated numerous documents, often in PDF or slide formats, which are inaccessible to LLMs, particularly RAG systems. With Mistral OCR, our customers can now convert rich and complex documents into readable content in all languages,” said Mistral co-founder and chief science officer Guillaume Lample.

“This is a crucial step toward the widespread adoption of AI assistants in companies that need to simplify access to their vast internal documentation,” he added.

Mistral OCR is available on Mistral’s own API platform or through its cloud partners (AWS, Azure, Google Cloud Vertex, etc.). And for companies working with classified or sensitive data, Mistral offers on-premises deployment.

According to the Paris-based AI company, Mistral OCR performs better than APIs from Google, Microsoft, and OpenAI. The company has tested its OCR model with complex documents that include mathematical expressions (LaTeX formatting), advanced layouts, or tables. It is also supposed to perform better with non-English documents.

Given that Mistral OCR does one thing and one thing only, the company believes it is also faster than what’s out there. That’s not a surprise if you compare it with a multimodal LLM like GPT-4o, which also has OCR capabilities (among many other features).

Mistral is also using Mistral OCR for its own AI assistant Le Chat. When a user uploads a PDF file, the company uses Mistral OCR in the background to understand what’s in the document before processing the text.

Companies and developers will most likely use Mistral OCR with a RAG (aka Retrieval-Augmented Generation) system to use multimodal documents as input in an LLM. And there are many potential use cases. For instance, we could envisage law firms using it to help them swiftly plough through huge volumes of documents.

RAG is a technique that’s used to retrieve data and use it as context with a generative AI model.
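The retrieve-then-prompt pattern can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Mistral or any particular product implements RAG: it ranks chunks by word-count cosine similarity, whereas real systems typically use learned embeddings and a vector index. The function names, the sample contract chunks, and the prompt wording are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and keep only alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text that would be sent to a generative model."""
    joined = "\n\n".join(context)
    return f"Answer using only this context:\n\n{joined}\n\nQuestion: {question}"


# Hypothetical Markdown chunks, as might come out of an OCR step.
chunks = [
    "## Termination\n\nEither party may terminate the contract with 60 days notice.",
    "## Payment\n\nInvoices are due within 30 days of receipt.",
    "## Liability\n\nLiability is capped at the fees paid in the prior year.",
]
question = "When are invoices due?"
top = retrieve(question, chunks)
prompt = build_prompt(question, top)
```

The model only ever sees the retrieved chunk plus the question, which is why the quality of the extracted text matters so much upstream.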
