Content.Fans

DeepSeekMath-V2 scores 118/120 on Putnam, achieves IMO Gold

by Serge Bulaev
December 1, 2025
in AI News & Trends

DeepSeekMath-V2 has reached gold-medal level on the International Mathematical Olympiad (IMO) and scored 118/120 on the Putnam exam, establishing a new frontier in AI-driven mathematical reasoning. Developed by DeepSeek AI, the model’s performance is all the more significant because its weights are openly released, giving researchers a transparent blueprint for large language models that prioritize verifiable proofs over bare answers.

Competition scores that outpace humans

DeepSeekMath-V2 demonstrates superhuman performance in competition mathematics, securing a near-perfect 118/120 on the 2024 Putnam exam and matching the gold medal standard for the 2025 IMO. These results, achieved by an open-weights model, surpass top human scores and rival leading closed, proprietary AI systems.

The model’s 118/120 score on the 2024 Putnam exam far surpasses the top human score of 90, as reported by Marktechpost. It also scored 99% on the IMO-ProofBench Basic subset, outperforming Google’s Gemini DeepThink by 10 percentage points. On the more challenging Advanced subset, it scored 62%, as cited by Apidog.

A summary of its achievements:

  • IMO 2025: Gold medal standard, 5 of 6 full solutions
  • Putnam 2024: 118/120
  • IMO-ProofBench Basic: 99% success rate
  • Parameters: 685 billion mixture-of-experts

Self-verifiable architecture drives accuracy

The model’s high accuracy stems from a novel self-verifiable architecture. It pairs a powerful proof generator with a lightweight verifier that systematically checks each logical step by parsing it into an abstract syntax tree. This verifier acts as the reward model during training, compelling the generator to correct its own errors before producing a final output. At inference, DeepSeek scales this process by generating up to 64 candidate proofs, running 64 parallel verifications on them, and repeating the loop up to 16 times, a method reported to reduce error rates by 40% over baseline models. This dynamic closes the “generation-verification gap” that limited previous systems, while a sparse attention architecture allows the 685B-parameter model to maintain context over long, complex derivations.
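
The generate-verify-refine loop described above can be illustrated with a small toy simulation. This is a minimal sketch, not DeepSeek's implementation: the `Proof` class, the `generate`, `verify`, `pass_rate`, and `prove` functions, the hidden `quality` score, and the acceptance threshold are all hypothetical stand-ins. Only the three loop sizes (64 candidates, 64 verifications, 16 rounds) come from the article.

```python
import random
from dataclasses import dataclass

# Loop sizes taken from the article; everything else here is a toy assumption.
N_CANDIDATES = 64      # candidate proofs drafted per round
N_VERIFICATIONS = 64   # independent verifier passes per candidate
MAX_ROUNDS = 16        # maximum number of refinement rounds


@dataclass
class Proof:
    text: str
    quality: float  # hidden "true" soundness in [0, 1]; a toy stand-in


def generate(rng, seed_proof=None):
    """Toy generator: refine the seed proof, or draft one from scratch."""
    base = seed_proof.quality if seed_proof else 0.3
    quality = min(1.0, max(0.0, base + rng.uniform(-0.05, 0.15)))
    return Proof(text=f"proof(q={quality:.2f})", quality=quality)


def verify(rng, proof):
    """Toy verifier: a single noisy accept/reject judgment."""
    return rng.random() < proof.quality


def pass_rate(rng, proof):
    """Fraction of independent verifier passes that accept the proof."""
    votes = sum(verify(rng, proof) for _ in range(N_VERIFICATIONS))
    return votes / N_VERIFICATIONS


def prove(seed=0, accept_at=1.0):
    """Run the loop; return (best proof, its pass rate, rounds used)."""
    rng = random.Random(seed)
    best, best_rate = None, -1.0
    for round_no in range(1, MAX_ROUNDS + 1):
        # Each round drafts new candidates, seeded from the best proof so far.
        candidates = [generate(rng, best) for _ in range(N_CANDIDATES)]
        for cand in candidates:
            rate = pass_rate(rng, cand)
            if rate > best_rate:
                best, best_rate = cand, rate
        # Stop early once every verifier pass accepts the best candidate.
        if best_rate >= accept_at:
            break
    return best, best_rate, round_no


if __name__ == "__main__":
    proof, rate, rounds = prove(seed=0)
    print(proof.text, rate, rounds)
```

In this sketch, the verifier's pass rate plays the role of the selection signal: the generator only keeps a new candidate when more verifier passes accept it than accepted the previous best. The real system uses learned models for both roles, which the random stubs here do not attempt to capture.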

Open weights reshape research landscape

In a significant move for the AI community, DeepSeek released the model’s weights on Hugging Face under the Apache 2.0 license. This decision challenges the trend of closed, proprietary development for frontier-scale systems. Now, academics and independent researchers can reproduce the landmark Olympiad results, conduct ablation studies on the verifier-first pipeline, and fine-tune specialist models without relying on pay-per-token APIs. While an open model now rivals top systems from Google and OpenAI on formal proof tasks, its practical deployment requires significant hardware, such as eight A100 GPUs. The model also shows room for improvement, trailing Gemini DeepThink slightly on the IMO-ProofBench Advanced split. Nonetheless, DeepSeekMath-V2 establishes a new, publicly accessible baseline: a model that outperforms elite human mathematicians and exposes its internal workings for all to scrutinize and build upon.

Serge Bulaev

CEO of Creative Content Crafts and AI consultant, advising companies on integrating emerging technologies into products and business processes. Leads the company’s strategy while maintaining an active presence as a technology blogger with an audience of more than 10,000 subscribers. Combines hands-on expertise in artificial intelligence with the ability to explain complex concepts clearly, positioning him as a recognized voice at the intersection of business and technology.
