OpenAI launches new tools to help businesses build AI agents | TechCrunch

On Tuesday, OpenAI released new tools designed to help developers and enterprises build AI agents – automated systems that can independently accomplish tasks – using the company’s own AI models and frameworks.

The tools are part of OpenAI’s new Responses API, which lets businesses develop custom AI agents that can perform web searches, scan through company files, and navigate websites, much like OpenAI’s Operator product. The Responses API effectively replaces OpenAI’s Assistants API, which the company plans to sunset in the first half of 2026.

The hype around AI agents has grown dramatically in recent years despite the fact that the tech industry has struggled to show people, or even define, what “AI agents” really are. In the most recent example of agent hype running ahead of utility, Chinese startup Butterfly Effect earlier this week went viral for a new AI agent platform called Manus that users quickly discovered didn’t deliver on many of the company’s promises.

In other words, the stakes are high for OpenAI to get agents right.

“It’s pretty easy to demo your agent,” Olivier Godement, OpenAI’s API product head, told TechCrunch in an interview. “To scale an agent is pretty hard, and to get people to use it often is very hard.”

Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both tools offered a glimpse at what agentic technology can achieve, but left quite a bit to be desired in the “autonomy” department.

Now with the Responses API, OpenAI wants to sell access to the components that power AI agents, allowing developers to build their own Operator- and deep research-style agentic applications. OpenAI hopes that developers can create some applications with its agent technology that feel more autonomous than what’s available today.

Using the Responses API, developers can tap the same AI models, currently in preview, that sit under the hood of ChatGPT Search, OpenAI’s web search tool: GPT-4o search and GPT-4o mini search. The models can browse the web for answers to questions, citing sources as they generate replies.
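
For developers wondering what such a call might look like, here is a minimal sketch that only builds a request body and sends nothing over the network. The endpoint URL, the model name "gpt-4o", and the tool identifier "web_search_preview" are assumptions that are not stated in this article, and the helper name build_search_request is hypothetical; check OpenAI's API reference before relying on any of them.

```python
import json

# Assumed endpoint for the Responses API; not stated in the article.
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build a request body asking the model to answer with web search enabled."""
    return {
        "model": model,  # assumed model name
        "input": question,
        # Assumed identifier for the preview web search tool.
        "tools": [{"type": "web_search_preview"}],
    }


if __name__ == "__main__":
    body = build_search_request("What did OpenAI announce for developers this week?")
    # Print the JSON that would be POSTed to RESPONSES_URL with an API key.
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect or log exactly which tools an agent is allowed to use before any request is made.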

OpenAI claims that GPT-4o search and GPT-4o mini search are highly factually accurate. On the company’s SimpleQA benchmark, which measures the ability of models to answer short, fact-seeking questions, GPT-4o search scores 90% while GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5 – OpenAI’s much larger, recently released model – scores just 63%.

That AI-powered search tools are more accurate than traditional AI models is not necessarily surprising – in theory, GPT-4o search can just look up the right answer.

The Responses API also includes a file search utility that can quickly scan across files in a company’s databases to retrieve information. (OpenAI claims that it won’t train models on these files.) In addition, developers using the Responses API can tap OpenAI’s Computer-Using Agent (CUA) model, which powers Operator. The model generates mouse and keyboard actions, allowing developers to automate computer use tasks like data entry and app workflows.
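
To make the idea of model-generated mouse and keyboard actions concrete, the sketch below routes a short list of actions to a stand-in desktop that only records what it is asked to do. The action shapes (a "type" field plus "x", "y", or "text") and every name in the code are hypothetical illustrations, not the CUA model's documented output format.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingDesktop:
    """Stand-in for a real input driver; it only records what it was asked to do."""
    log: list = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click at ({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type {text!r}")


def apply_action(desktop: RecordingDesktop, action: dict) -> None:
    """Route one model-proposed action to the matching desktop method."""
    kind = action.get("type")
    if kind == "click":
        desktop.click(action["x"], action["y"])
    elif kind == "type":
        desktop.type_text(action["text"])
    else:
        # Refuse anything unrecognized rather than guessing at intent.
        raise ValueError(f"unsupported action: {kind!r}")


if __name__ == "__main__":
    desktop = RecordingDesktop()
    # Hypothetical actions, as a model might propose for a data-entry task.
    for action in [
        {"type": "click", "x": 120, "y": 340},
        {"type": "type", "text": "Invoice 1042"},
    ]:
        apply_action(desktop, action)
    print(desktop.log)
```

Rejecting unknown action types outright is one conservative design choice for a system that is allowed to operate a real machine.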

Enterprises can optionally run the CUA model, which is being released in research preview, locally on their own systems, OpenAI said. The consumer version of the CUA available in Operator can only take actions on the web.

To be clear, the Responses API won’t solve all the technical problems plaguing AI agents today.

While AI-powered search tools are more accurate than traditional AI models – a fact that is unsurprising given they can just look up the right answer – web search does not render AI hallucinations a solved problem. GPT-4o search still gets 10% of factual questions wrong. Beyond their accuracy, AI search tools also tend to struggle with short, navigational queries (such as “Lakers score today”), and recent reports suggest that ChatGPT’s citations aren’t always reliable.

In a blog post provided to TechCrunch, OpenAI said that the CUA model is “not yet highly reliable for automating tasks on operating systems,” and that it’s susceptible to making “inadvertent” mistakes.

However, OpenAI said these are early iterations of its agent tools, and that it is constantly working to improve them.

Alongside the Responses API, OpenAI is releasing an open-source toolkit called the Agents SDK, which offers developers free tools to integrate models with their internal systems, put in place safeguards, and monitor AI agent activities for debugging and optimization purposes. The Agents SDK is a follow-up of sorts to OpenAI’s Swarm, a framework for multi-agent orchestration that the company released late last year.

Godement said he hopes OpenAI can bridge the gap between AI agent demos and products this year, and that, in his opinion, “agents are the most impactful application of AI that will happen.” That echoes a proclamation OpenAI CEO Sam Altman made in January: that 2025 is the year AI agents enter the workforce.

Whether or not 2025 truly becomes the “year of the AI agent,” OpenAI’s latest releases show the company wants to shift from flashy agent demos to impactful tools.
