Date: 2026-01-29T07:52:36-08:00

Summary

Yes, Kimi K2.5 is open source; it was released by Moonshot AI in January 2026 and is a 1-trillion-parameter mixture-of-experts multimodal model available on Hugging Face. However, running it locally requires hardware resources beyond what a single Mac Studio M3 Ultra can practically handle.


Key Findings

Open Source Status

  • Yes, fully open source - Released by Moonshot AI and available on Hugging Face
  • Code and model weights are publicly available
  • Licensed under open-source terms, enabling local deployment and customization

Model Parameters & Architecture

  • 1 trillion total parameters, with 32 billion activated per token
  • Mixture-of-Experts (MoE) architecture with 384 experts, 8 selected per token (see the sketch after this list)
  • 61 transformer layers with an attention hidden dimension of 7,168
  • Vision encoder: MoonViT with 400M parameters (supports images, video, PDFs)
  • Context window: 256K tokens
  • Native INT4 quantization support
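
Where the 32B activated figure comes from: per token, only 8 of the 384 experts fire, plus the always-active shared layers. A back-of-the-envelope sketch in Python (the expert/shared split below is an illustrative assumption, not a published number; it is chosen so the result lands near the official 32B count):

```python
# Back-of-the-envelope MoE arithmetic for Kimi K2.5's headline numbers.
# The expert/shared split is an illustrative assumption, not a
# published figure.

TOTAL_PARAMS = 1.0e12    # 1T total parameters
NUM_EXPERTS = 384        # routed experts per MoE layer
ACTIVE_EXPERTS = 8       # experts selected per token
SHARED_PARAMS = 12e9     # assumed always-active weights (attention, embeddings, router)

expert_params = TOTAL_PARAMS - SHARED_PARAMS
active_per_token = SHARED_PARAMS + expert_params * ACTIVE_EXPERTS / NUM_EXPERTS

print(f"Approx. active parameters per token: {active_per_token / 1e9:.1f}B")
# -> ~32.6B with these assumptions, consistent with the published 32B:
# only ACTIVE_EXPERTS / NUM_EXPERTS of the expert weights fire per token.
```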

System Requirements

Memory Requirements:

  • Full precision (FP16): ~2TB for the weights alone
  • Quantized (INT4): ~500GB minimum
  • Recommended minimum: 240GB unified memory for reasonable performance (this floor implies quantization more aggressive than plain INT4, or offloading part of the weights; see the worked estimate after this list)
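
Both headline figures fall out of parameter-count arithmetic. A quick sketch (weights only; KV cache, activations, and runtime overhead are ignored, so real deployments need headroom beyond these numbers):

```python
# Weight-memory estimate: parameters x bytes-per-parameter.
# Treat these as lower bounds for the weights alone.

TOTAL_PARAMS = 1.0e12  # 1T parameters

def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Raw weight footprint in GB for a given quantization level."""
    return params * bits_per_weight / 8 / 1e9

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB")

# FP16: ~2,000 GB (~2TB)   INT8: ~1,000 GB   INT4: ~500 GB
# This is why the ~500GB INT4 figure exceeds a single 256GB machine.
```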

Supported Inference Engines:

  • vLLM
  • SGLang
  • KTransformers
  • MLX (for Apple Silicon; see the sketch after this list)
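
On Apple Silicon the MLX path is the natural one. A minimal sketch using the standard mlx-lm API, assuming an MLX INT4 conversion is published on Hugging Face (the repo id below is a placeholder, not a confirmed name):

```python
# Minimal mlx-lm sketch for Apple Silicon (pip install mlx-lm).
# The repo id is a placeholder -- substitute the actual MLX INT4
# conversion of Kimi K2.5 from Hugging Face.

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Kimi-K2.5-4bit")  # hypothetical repo id

prompt = "Explain mixture-of-experts routing in two sentences."
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```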

Mac Studio M3 Ultra Performance Reality

Critical Limitation: A Mac Studio M3 Ultra with 256GB of unified memory is insufficient:

  • 256GB clears the 240GB recommended floor, but falls well short of the ~500GB needed to hold the full INT4 weights
  • Expected performance: ~21 tokens/second (very slow) even on a setup with enough memory (see the bandwidth estimate after this list)
  • Practical requirement: 2× Mac Studio M3 Ultra systems clustered together (512GB total)
  • Single 256GB M3 Ultra: cannot adequately run the full model due to the memory bottleneck
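
The ~21 tokens/second figure is roughly what a memory-bandwidth roofline predicts: each decoded token must stream every active weight from memory. A sketch assuming the M3 Ultra's advertised 819GB/s bandwidth and an illustrative efficiency factor (both the efficiency value and the accounting are assumptions, not measurements):

```python
# Roofline estimate for decode speed on a memory-bandwidth-bound MoE.
# Each token reads only the active parameters (32B) from unified memory.
# The efficiency factor is an illustrative assumption covering routing
# overhead, KV cache traffic, and non-ideal memory access.

BANDWIDTH_GBPS = 819      # M3 Ultra advertised memory bandwidth (GB/s)
ACTIVE_PARAMS = 32e9      # activated parameters per token
BYTES_PER_WEIGHT = 0.5    # INT4 quantization
EFFICIENCY = 0.4          # assumed fraction of peak bandwidth achieved

bytes_per_token = ACTIVE_PARAMS * BYTES_PER_WEIGHT
tokens_per_sec = BANDWIDTH_GBPS * 1e9 * EFFICIENCY / bytes_per_token

print(f"Estimated decode speed: ~{tokens_per_sec:.0f} tokens/s")
# -> ~20 tokens/s with these assumptions, in line with the ~21 figure.
```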

Practical Recommendation

For Mac Studio M3 Ultra users, the viable options are:

  1. Use the API - Access via Moonshot’s platform at $0.60/M input tokens; much more practical (see the client sketch after this list)
  2. Cluster multiple Macs - Requires 2+ Mac Studio systems linked by a high-bandwidth interconnect
  3. Use quantized versions - MLX-optimized INT4 quantizations are available, but still challenging on a single 256GB system
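
Moonshot’s platform exposes an OpenAI-compatible endpoint, so option 1 is a few lines with the standard openai client. The model identifier below is an assumption; check the platform documentation for the exact name:

```python
# Hedged sketch of option 1: calling Kimi via Moonshot's
# OpenAI-compatible API (pip install openai). The model name is an
# assumption -- verify the exact identifier in the platform docs.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.ai/v1",  # Moonshot's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2.5",  # assumed model id; check the docs
    messages=[{"role": "user", "content": "Summarize MoE inference trade-offs."}],
)
print(response.choices[0].message.content)
```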
