AI TOOLS 2024.
After considerable anticipation, OpenAI has officially revealed their new o1-preview series of AI models, which can address challenging issues in fields like arithmetic, science, and coding. As part of an early preview, the models are now accessible through ChatGPT and the API; frequent upgrades and enhancements are anticipated.
“Very proud of the team; the entire company should be proud of this enormous effort.”Chief OpenAI Officer Sam Altman wrote on X, “I hope you enjoy it.” “No more patience, Jimmy,” he even concluded his inside joke with AI insider Jimmy Apples, to which Apples responded, “It feels good, Sam.” Excellent.
By training them to deliberate longer before reacting, the o1 series models enhance their capacity for problem-solving and sharpen their thought processes. The reasoning model’s next iteration produced impressive results in math and coding competitions, matching the performance of PhD students on tasks in physics, chemistry, and biology in early tests. Whereas GPT-4o scored 13%, the model achieved 83% in a qualifying exam for the International Mathematics Olympiad.
Despite its advanced reasoning abilities, the o1-preview model lacks some of the practical features found in GPT-4o, such as browsing the web and file uploading. However, OpenAI emphasises the model’s potential for tackling complex tasks, particularly in fields requiring multi-step workflows.
As part of the release, OpenAI has implemented a new safety training approach that allows the models to better follow safety rules. In jailbreaking tests, o1-preview outperformed GPT-4o, scoring 84 out of 100 compared to GPT-4o’s 22. OpenAI has also bolstered its safety efforts by partnering with AI safety institutes in the U.S. and U.K.
Alongside o1-preview, OpenAI has released a smaller, cost-effective model called o1-mini, specifically designed for developers who need advanced coding capabilities without broad world knowledge. o1-mini is 80% cheaper than o1-preview.
Starting today, ChatGPT Plus and Team users can manually select o1-preview and o1-mini from the model picker, with rate limits of 30 messages for o1-preview and 50 for o1-mini. API users in the highest usage tier can also begin prototyping, although some features like function calling and streaming are not yet available.
OpenAI plans to expand access to o1-mini for ChatGPT Free users and will continue adding new features to the o1 series, including browsing and file uploads.
Devin creator, Cognition Labs, worked closely with OpenAI over the last few weeks to evaluate OpenAI o1’s reasoning capabilities with Devin. They found that the new series of models represents a significant improvement for agentic systems that deal with code.
It All Builds Up to This
Altman, too, had hinted a few days earlier in a cryptic post that the company was working on a project known internally as Project Strawberry, also referred to as Q*.
“I love summer in the garden,” wrote Altman on X, posting the image of a terracotta pot containing a strawberry plant with lush green leaves and small, ripening strawberries.
Project Strawberry was said to significantly enhance the reasoning capabilities of OpenAI’s AI models. It is pretty clear that o1-preview is exclusively Strawberry. Meanwhile, OpenAI is also in talks to raise funds with an increased valuation of $150 billion. This funding round, led by Thrive Capital, would make Sam Altman’s Microsoft-backed company one of the most powerful startups in Silicon Valley.