IBM launches 4 open-source Granite 4.0 Nano AI models

Serge Bulaev

IBM's new open-source Granite 4.0 Nano AI models bring powerful, efficient language processing directly to consumer devices. Released in October 2025 on GitHub and Hugging Face, these four small models are designed for edge workloads on hardware like smartphones and sensors, eliminating the need for cloud GPUs.

Why the Granite 4.0 Nano Family Stands Out

The Granite 4.0 Nano family consists of four compact AI models, ranging from 350 million to 1.5 billion parameters. Their key distinction is delivering high performance on local hardware, enabling complex AI tasks like function calling and instruction following on devices without sending data to the cloud.

The lineup splits evenly: two pure transformer models and two hybrid Mamba-2/transformer models. Benchmark results for the flagship 1.5B hybrid show it leading its class:

  • IFEval (Instruction Following): 78.5, outscoring Alibaba's Qwen3 1.7B and Google's Gemma 3 1B.
  • Berkeley Function Calling: 54.8, more than triple the score of Gemma 3.

With a memory footprint under 6 GB for the largest model, Granite Nano enables real-time inference on devices like smartphones. IBM also provides enterprise-grade trust with ISO 42001 certification and cryptographic signatures. Internal tests cited in a SiliconANGLE report suggest over 90% cost savings compared to 7B cloud models.
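
For developers who want to try the models locally, they load through the standard Hugging Face transformers workflow. The sketch below is a minimal example, not an official recipe; the checkpoint name is an assumption, so check the ibm-granite organization on Hugging Face for the exact Granite 4.0 Nano model IDs.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# MODEL_ID is an assumption; check the ibm-granite org for exact names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ibm-granite/granite-4.0-h-1b"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # roughly halves memory versus fp32
    device_map="auto",           # picks CPU, GPU, or Apple Silicon
)

# Chat-style prompting via the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Summarize why on-device AI matters."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```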

Real-World Applications in Edge and IoT

The ability to run powerful AI locally makes Granite 4.0 Nano ideal for industries where latency, privacy, and connectivity are critical. Early use cases include inspection drones in manufacturing, in-car voice assistants that keep data private, and AR headsets providing offline maintenance instructions. Key benefits of this on-device approach include:

  • Low Latency: Local processing reduces response times to under 50 ms for voice tasks.
  • Energy Efficiency: Power consumption is 60-70% lower than comparable 6B-parameter models.
  • Data Privacy: On-device processing meets strict data residency rules in finance and healthcare.
  • Customization: Open weights under an Apache 2.0 license simplify fine-tuning for specific domains (see the sketch after this list).
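
On the customization point, a lightweight adapter method such as LoRA is a common way to specialize open-weight models without retraining them fully. The sketch below uses the peft and trl libraries; the checkpoint name and the toy dataset are placeholders for illustration, not details from IBM's release.

```python
# Hedged LoRA fine-tuning sketch using peft + trl; the checkpoint name
# and dataset are placeholders, not from IBM's documentation.
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

MODEL_ID = "ibm-granite/granite-4.0-h-1b"  # hypothetical checkpoint name

# Tiny in-memory corpus standing in for a real domain dataset.
train_data = Dataset.from_list([
    {"text": "Q: What is the torque spec for valve V2? A: 45 Nm."},
    {"text": "Q: How often is the coolant filter replaced? A: Every 500 h."},
])

peft_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model=MODEL_ID,            # trl loads the base model by name
    train_dataset=train_data,
    peft_config=peft_config,   # only the small adapter weights are trained
    args=SFTConfig(output_dir="granite-nano-lora", max_steps=10),
)
trainer.train()
```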

Competitive Snapshot

The table below highlights how the flagship Granite 4.0 Nano model compares against other small open-source models on key industry benchmarks.

Model              Params   IFEval   Berkeley FC   Safety certification
Granite 4.0 H 1B   1.5 B    78.5     54.8          ISO 42001
Qwen3              1.7 B    73.1     52.2          none
Gemma 3            1 B      59.3     16.3          none

The data confirms Granite's lead in instruction-following and tool-use capabilities for edge devices, a conclusion detailed in the IBM Think analysis.
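
To illustrate what tool use looks like in practice, recent transformers versions let a chat template advertise a tool schema built from a plain Python function; the model then emits a structured call for the host application to execute. The sketch below is a hedged example and the checkpoint name is again an assumption.

```python
# Hedged tool-calling sketch: the chat template advertises a tool schema
# and the model replies with a structured call. Checkpoint name assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ibm-granite/granite-4.0-h-1b"  # hypothetical checkpoint name

def get_temperature(location: str) -> float:
    """Return the current temperature in Celsius for a location."""
    return 21.5  # stub; a real app would query a local sensor or API

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

messages = [{"role": "user", "content": "How warm is it in Zurich right now?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_temperature],  # schema built from type hints + docstring
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Expect a JSON tool call naming get_temperature with a location argument.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```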

The Future of On-Device AI

IBM is expanding the Granite ecosystem through collaborations with Qualcomm for NPU optimization and Red Hat for enterprise device management. With weights available for free commercial use under an Apache 2.0 license, developers can download the models from Hugging Face for diverse applications, from browser-based inference with WebGPU to deployment on tinyML stacks. This move helps democratize advanced AI, making powerful language reasoning practical outside the data center, a trend the SiliconANGLE report describes as "cloudless AI in practice."

Written by

Serge Bulaev

Founder & CEO of Creative Content Crafts and creator of Co.Actor — an AI tool that helps employees grow their personal brand and their companies too.