OpenAI's ambitious GPT-5 project, codenamed Orion, is facing significant challenges, including high costs, data shortages, and technical hurdles. Analysts question whether the project can deliver the next leap in AI that is expected of it.
The next leap in AI seems to have been delayed.
On December 20 (local time), The Wall Street Journal reported that OpenAI's next-generation AI project, GPT-5 (codenamed Orion), is facing significant challenges. Despite more than 18 months of development and enormous costs, the project has not delivered the expected results.
According to insiders, OpenAI's largest backer, Microsoft, initially anticipated seeing the new model by mid-2024. OpenAI has conducted at least two large-scale training runs, each taking months and consuming massive amounts of data, but each run uncovered new issues, and the software failed to meet researchers' expectations.
Analysts speculate that there may not be enough data in the world to make the model sufficiently intelligent.
1. Enormous Costs and Slow Progress
Analysts previously estimated that tech giants might invest $1 trillion in AI projects over the coming years. Training GPT-5 for just six months could cost approximately $500 million, with OpenAI CEO Sam Altman stating that future AI models might cost over $1 billion.
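To put those figures in perspective, here is a rough back-of-the-envelope calculation of what a run on that scale implies; the GPU rental price is an assumption chosen for illustration, not a number from the report.

```python
# Back-of-the-envelope: what a ~$500 million, six-month training run
# implies about cluster size. The GPU rental price is an assumption.

total_cost = 500e6            # ~$500 million, per the report
hours = 182 * 24              # ~six months of wall-clock time
price_per_gpu_hour = 2.50     # assumption: H100-class rental rate, $/hour

gpu_hours = total_cost / price_per_gpu_hour   # 200 million GPU-hours
gpus = gpu_hours / hours                      # ~46,000 GPUs running nonstop
print(f"~{gpu_hours:,.0f} GPU-hours, roughly {gpus:,.0f} concurrent GPUs")
```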
However, insiders noted:
“While Orion performs better than OpenAI's current products, it does not justify its enormous operating costs.”
In October, OpenAI received a valuation of $157 billion, largely based on Altman's prediction that GPT-5 would represent a "major leap forward." Altman compared GPT-4's capabilities to those of a smart high-school student, whereas GPT-5 was expected to perform like a PhD holder on certain tasks.
Reports suggest that GPT-5 was intended to unlock new scientific discoveries and handle routine human tasks like scheduling appointments or booking flights. Researchers hoped it would make fewer errors than existing AI or, at the very least, acknowledge uncertainty—a challenge for current models prone to generating hallucinations.
Yet there is no clear standard for what counts as "intelligent enough" AI, and much of the judgment remains subjective.
So far, the development of GPT-5 does not inspire confidence. In November, Altman stated, “No product named GPT-5 will be released in 2024.”
2. Data Shortages as the Major Bottleneck
To avoid wasting massive investments, researchers have tried to minimize failure risks through small-scale trial runs.
However, the GPT-5 plan seemed problematic from the start. In mid-2023, OpenAI initiated a training run to test Orion's proposed new design. Progress was slow, suggesting that larger-scale training could take extremely long and drive costs even higher.
OpenAI researchers determined that making Orion smarter would require more high-quality, diverse data. Testing is an ongoing process, and a large-scale training run can span months, feeding trillions of tokens into the model.
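For a sense of why such runs stretch across months, one can apply the common approximation that training compute is about six floating-point operations per parameter per token. The parameter count, GPU throughput, and cluster size below are illustrative assumptions, not reported details.

```python
# Rough training-time estimate using the standard C ≈ 6 * N * D
# FLOPs approximation. Parameter count, GPU throughput, and cluster
# size are illustrative assumptions.

params = 1e12                   # assumption: a 1-trillion-parameter model
tokens = 13e12                  # ~13 trillion tokens (reported for GPT-4)
total_flops = 6 * params * tokens         # ≈ 7.8e25 FLOPs

effective_flops_per_gpu = 3e14  # assumption: H100-class at ~30% utilization
num_gpus = 25_000               # assumption: cluster size

seconds = total_flops / (effective_flops_per_gpu * num_gpus)
print(f"~{seconds / 86_400:.0f} days of continuous training")  # ~120 days
```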
However, publicly available data—news articles, social media posts, scientific papers, and more—has become insufficient. Ari Morcos, CEO of DatologyAI, explained:
“It’s getting very expensive, and finding more data of equivalent quality is very difficult.”
To address this, OpenAI has opted to create data from scratch, hiring software engineers and mathematicians to write new code or solve mathematical problems that can then serve as training data.
The company is also collaborating with experts in fields like theoretical physics to tackle complex problems in their domains, but this process is slow. GPT-4's training involved about 13 trillion tokens; even if 1,000 people each wrote 5,000 words a day, it would take months to produce just 1 billion tokens.
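A quick sanity check of that arithmetic, assuming a common rule of thumb of roughly 0.75 English words per token:

```python
# Sanity check of the human-written-data arithmetic. The words-per-token
# ratio is an assumed rule of thumb, not a figure from the report.

writers = 1_000
words_per_writer_per_day = 5_000
words_per_token = 0.75          # assumption: ~0.75 English words per token

tokens_per_day = writers * words_per_writer_per_day / words_per_token
days_for_1b = 1e9 / tokens_per_day            # ~150 days, i.e. months
share_of_gpt4 = 1e9 / 13e12                   # vs. GPT-4's ~13T tokens

print(f"~{days_for_1b:.0f} days to write 1 billion tokens")
print(f"that is only {share_of_gpt4:.5%} of GPT-4's training corpus")
```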
OpenAI has also begun developing "synthetic data," using AI-generated content to train Orion. Researchers believe that using data generated by another of its models, o1, might help avoid training failures.
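In outline, a synthetic-data pipeline of this kind prompts one model and collects its outputs as training examples for another. The sketch below is hypothetical: the model name, seed prompts, and output format are illustrative assumptions, not details from the report.

```python
# Hypothetical sketch of a synthetic-data pipeline: prompt a "teacher"
# model and store its answers as training examples. Model name, seed
# prompts, and output format are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

seed_problems = [
    "Prove that the sum of two even integers is even.",
    "Write a Python function that merges two sorted lists.",
]

with open("synthetic_train.jsonl", "w") as f:
    for problem in seed_problems:
        response = client.chat.completions.create(
            model="o1",  # assumption: any capable teacher model
            messages=[{"role": "user", "content": problem}],
        )
        answer = response.choices[0].message.content
        # Save as a prompt/completion pair for a later training run.
        f.write(json.dumps({"prompt": problem, "completion": answer}) + "\n")
```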
3. Competition from Google and Diverted Resources at OpenAI
This year, Google’s release of the popular AI application NotebookLM has heightened the pressure on OpenAI.
With Orion stagnating, the company has shifted focus to other projects and applications, including a streamlined version of GPT-4 and Sora, an AI capable of generating videos. Insiders revealed that this has led to competition for limited computational resources between new product teams and Orion researchers.
Additionally, OpenAI is developing more advanced reasoning models, hoping that allowing AI to “think” for longer can address complex, novel problems.
However, these new strategies face obstacles. Apple researchers found that reasoning models like OpenAI’s o1 might merely mimic training data instead of genuinely solving new problems. Moreover, o1’s approach of generating multiple answers significantly increases operating costs.
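The cost concern is easy to illustrate: under a sampling strategy that generates several candidate answers per query, inference cost grows roughly linearly with the number of samples. The prices and token counts below are assumptions for illustration, not OpenAI's actual rates.

```python
# Illustrative cost model for multi-answer (best-of-N) inference.
# Prices and token counts are assumptions, not OpenAI's actual rates.

price_per_1k_output_tokens = 0.06   # assumed $/1K generated tokens
tokens_per_answer = 800             # assumed length of one candidate answer

def cost_per_query(num_answers: int) -> float:
    """Cost of sampling num_answers candidates for a single query."""
    return num_answers * tokens_per_answer / 1_000 * price_per_1k_output_tokens

for n in (1, 4, 16):
    print(f"{n:>2} candidate answer(s): ${cost_per_query(n):.4f} per query")
# Cost scales linearly with the number of sampled answers.
```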
Despite the challenges, OpenAI remains committed to developing GPT-5. On Friday, Altman announced plans for a new reasoning model that would be smarter than any previous product but did not disclose when—or if—a model worthy of being called GPT-5 would launch.