architecture-and-storage
Purpose
This document answers the question “Does Wax use SQLite under the hood?” and gives a technical overview of Wax’s storage architecture, search capabilities, and performance characteristics.
Executive Summary
Wax does NOT use SQLite as its primary storage engine. However, it does use SQLite FTS5 (Full-Text Search 5) as one component for BM25 full-text searching. The tagline “The SQLite for AI memory” is an analogy about simplicity and portability, not a description of its implementation.
Wax uses a custom binary file format (.mv2s) that combines multiple storage and indexing technologies into a single, self-contained, crash-safe file optimized for on-device AI memory workloads.
What Is Wax?
Repository: github.com/christopherkarani/Wax
Wax is a Swift library that provides a memory layer for on-device AI agents, eliminating the need for Docker, separate vector databases, and network infrastructure. It’s designed for:
- 🤖 AI assistants that remember users across launches
- 📱 Offline-first apps with serious search requirements
- 🔒 Privacy-critical products where data never leaves the device
- 🧪 Research tooling that needs reproducible retrieval
- 🎮 Agent workflows that require durable state
Storage Architecture
The .mv2s File Format
Wax’s custom binary format includes:
- Dual 4KB header pages - Enables atomic updates without corrupting the file
- Write-ahead log (WAL) ring buffer - Provides crash recovery
- Compressed document payloads - Efficient storage of text content
- Compressed embeddings - Vector representations for semantic search
- Table of Contents with footer checksum - Integrity verification
Key Properties:
- Append-only design
- Checksum-verified
- Dual-header for atomicity
- Self-contained (single file)
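A minimal sketch of how a dual-header scheme can give atomic updates, assuming each header page carries a generation counter and a checksum. The field names, the `HeaderPage` type, and the FNV-1a checksum are illustrative assumptions; the actual .mv2s header layout is not described in this document.

```swift
// Hypothetical header page; the real .mv2s fields and checksum may differ.
struct HeaderPage {
    var generation: UInt64   // incremented on every commit
    var tocOffset: UInt64    // where the current table of contents starts
    var checksum: UInt32     // covers generation and tocOffset

    init(generation: UInt64, tocOffset: UInt64) {
        self.generation = generation
        self.tocOffset = tocOffset
        self.checksum = HeaderPage.fnv1a(generation, tocOffset)
    }

    /// Stand-in checksum: 32-bit FNV-1a over the little-endian bytes of each value.
    static func fnv1a(_ values: UInt64...) -> UInt32 {
        var hash: UInt32 = 2_166_136_261
        for value in values {
            for shift in stride(from: 0, to: 64, by: 8) {
                hash ^= UInt32((value >> shift) & 0xFF)
                hash = hash &* 16_777_619
            }
        }
        return hash
    }

    /// A page that was only partially written no longer matches its checksum.
    var isValid: Bool { checksum == HeaderPage.fnv1a(generation, tocOffset) }
}

/// Picks the valid header with the highest generation, or nil if both are corrupt.
func activeHeader(_ a: HeaderPage, _ b: HeaderPage) -> HeaderPage? {
    [a, b].filter { $0.isValid }.max { $0.generation < $1.generation }
}
```

In a scheme like this, a commit overwrites only the inactive page, so a crash in the middle of the write leaves the previous page intact and the reader falls back to it.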
Hybrid Search Stack
Wax runs parallel search lanes that are fused at query time:
- BM25 via SQLite FTS5 - Full-text search using SQLite’s FTS5 extension
- HNSW vector index - Semantic search via the USearch library
- Temporal evidence lanes - Time-based filtering
- Structured evidence lanes - Metadata-based search
Query-Adaptive Fusion:
- Date-based queries boost temporal signals
- Conceptual queries boost vector + BM25 combination
- The weighting is chosen automatically from the query itself
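This document does not say which fusion formula Wax uses. As one plausible illustration, the sketch below applies weighted reciprocal rank fusion (RRF), a common way to merge ranked lists, with lane weights that depend on the query. The `Lane` enum, the keyword rule in `laneWeights`, and the weight values are all assumptions made for the example.

```swift
/// The four search lanes named above.
enum Lane: CaseIterable { case bm25, vector, temporal, structured }

/// Hypothetical rule: a date word or a four-digit year boosts the temporal lane.
func laneWeights(for query: String) -> [Lane: Double] {
    let dateWords: Set<String> = ["yesterday", "today", "last", "week", "month", "year"]
    let tokens = query.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map { String($0) }
    let looksTemporal = tokens.contains { token in
        dateWords.contains(token) || (token.count == 4 && Int(token) != nil)
    }
    return looksTemporal
        ? [.bm25: 1.0, .vector: 1.0, .temporal: 2.0, .structured: 1.0]
        : [.bm25: 1.5, .vector: 1.5, .temporal: 0.5, .structured: 1.0]
}

/// Weighted RRF: score(d) = sum over lanes of weight / (k + rank), with rank starting at 1.
func fuse(_ rankings: [Lane: [String]], weights: [Lane: Double], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    // Iterate lanes in a fixed order so floating-point sums are reproducible.
    for lane in Lane.allCases {
        guard let ranked = rankings[lane] else { continue }
        let weight = weights[lane] ?? 1.0
        for (index, id) in ranked.enumerated() {
            scores[id, default: 0] += weight / (k + Double(index + 1))
        }
    }
    // Score descending, then id ascending, so ties resolve deterministically.
    return scores
        .sorted { a, b in a.value != b.value ? a.value > b.value : a.key < b.key }
        .map { $0.key }
}
```

With rankings such as `[.bm25: ["a", "b"], .vector: ["b", "a"], .temporal: ["c", "a"]]`, a query like "what happened last week" lifts the temporal-only hit `c` above `b`, while a conceptual query keeps `b` ahead of `c`.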
SQLite FTS5 Integration
How SQLite is Used:
- SQLite FTS5 provides the BM25 ranking algorithm for traditional full-text search
- FTS5 also serves as the benchmark baseline (150ms in the latency table under Metal GPU Acceleration)
- Used as one lane in the hybrid search pipeline, not as the primary storage
What SQLite is NOT used for:
- Not the primary database engine
- Not used for document storage
- Not used for vector indexing
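For reference, this is the BM25 ranking function as given in the SQLite FTS5 documentation, where $f(q_i, D)$ is the frequency of query term $q_i$ in document $D$, $|D|$ is the document length in tokens, $\mathrm{avgdl}$ is the average document length, $N$ is the number of rows, and $n(q_i)$ is the number of rows containing $q_i$. FTS5 fixes $k_1 = 1.2$ and $b = 0.75$, and its `bm25()` function returns the score multiplied by $-1$ so that better matches sort first in ascending order. How Wax rescales this score before fusion is not covered here.

```latex
\mathrm{bm25}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b \,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(q_i) = \ln \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}
```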
Memory Types
Wax supports multiple memory modalities:
- 📝 Text Memory - Documents, notes, conversations
- 📸 Photo Memory - Photo library with OCR + CLIP embeddings
- 🎬 Video Memory - Video segments with transcripts
Hierarchical Summarization
Wax generates three-tier summaries:
- full - Complete document (for deep dives)
- gist - Key paragraphs (for balanced recall)
- micro - One-liner (for quick context)
At query time, it selects the appropriate tier based on query signals and remaining token budget.
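Tier selection can be sketched as "pick the richest tier that still fits". The `Summaries` type, the `preferDetail` flag standing in for "query signals", and the token costs are assumptions for illustration, not Wax's actual selection logic.

```swift
/// The three summary tiers, ordered from richest to cheapest.
enum Tier: CaseIterable { case full, gist, micro }

/// Hypothetical per-document record holding the token cost of each tier.
struct Summaries {
    var tokens: [Tier: Int]
}

/// Returns the richest allowed tier that fits the remaining budget,
/// or nil if even the cheapest tier does not fit.
/// `preferDetail == false` models a query that only needs quick context, so `full` is skipped.
func selectTier(for doc: Summaries, remainingBudget: Int, preferDetail: Bool) -> Tier? {
    let candidates: [Tier] = preferDetail ? Tier.allCases : [.gist, .micro]
    return candidates.first { tier in
        guard let cost = doc.tokens[tier] else { return false }
        return cost <= remainingBudget
    }
}
```

For a document costing 1200, 300, and 20 tokens across the three tiers, a budget of 500 yields `gist`, and a budget of 50 yields `micro`.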
Deterministic Token Budgeting
Wax uses strict cl100k_base token counting to prevent:
- Context window overflows
- Non-deterministic truncation
- Irreproducible results
This makes RAG behavior testable and benchmarkable.
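A minimal sketch of why a fixed token count plus a total ordering gives reproducible context. The whitespace-based `countTokens` is only a stand-in for cl100k_base counting, and `Snippet` and `pack` are hypothetical names, not part of Wax's API.

```swift
/// Stand-in token counter that splits on whitespace.
/// Assumption: a real implementation would use a cl100k_base BPE tokenizer.
func countTokens(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

/// Hypothetical retrieval result.
struct Snippet {
    let id: String
    let score: Double
    let text: String
}

/// Packs snippets into a fixed token budget.
/// The ordering is total (score descending, then id ascending),
/// so the same inputs always produce the same context.
func pack(_ snippets: [Snippet], budget: Int) -> [Snippet] {
    let ordered = snippets.sorted { a, b in
        a.score != b.score ? a.score > b.score : a.id < b.id
    }
    var remaining = budget
    var packed: [Snippet] = []
    for snippet in ordered {
        let cost = countTokens(snippet.text)
        // Skip snippets that do not fit instead of truncating them mid-text.
        guard cost <= remaining else { continue }
        packed.append(snippet)
        remaining -= cost
    }
    return packed
}
```

Because the output depends only on the inputs and the budget, a test can assert on the exact packed ids, even when the input order is shuffled.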
Metal GPU Acceleration
Vector search performance on Apple Silicon (M1 Pro, 10K docs × 384 dimensions):
| Mode | Latency |
|---|---|
| Metal warm | 0.84ms |
| Metal cold | 9.2ms |
| CPU fallback | 105ms |
| SQLite FTS5 baseline | 150ms |
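Key Insight: With a warm Metal pipeline, vector search completes in under a millisecond (0.84ms), roughly 125× faster than the CPU fallback (105ms). A cold Metal start takes 9.2ms. The SQLite FTS5 row is a full-text (BM25) baseline rather than a vector search.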
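The table reports latencies but not the computation behind them. As a rough CPU reference point for the workload at this scale (10K vectors × 384 dimensions), an exhaustive cosine-similarity scan could look like the sketch below. This is an assumption about the general shape of the problem only; Wax's Metal kernels and its HNSW index are not shown in this document.

```swift
/// Cosine similarity between two equal-length vectors.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "dimension mismatch")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denominator = (normA * normB).squareRoot()
    return denominator == 0 ? 0 : dot / denominator
}

/// Exhaustive top-k search: scores every stored vector against the query.
/// Cost grows with (number of vectors × dimensions), which is the part a GPU can parallelize.
func topK(query: [Float], in vectors: [[Float]], k: Int) -> [(index: Int, score: Float)] {
    let scored = vectors.enumerated().map { pair in
        (index: pair.offset, score: cosine(query, pair.element))
    }
    let ordered = scored.sorted { a, b in
        a.score != b.score ? a.score > b.score : a.index < b.index
    }
    return Array(ordered.prefix(k))
}
```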
Performance Benchmarks
Ingest Throughput
| Scale | Time | Rate |
|---|---|---|
| 1,000 docs | 0.309s | ~3,236 docs/s |
| 10,000 docs | 7.756s | ~1,289 docs/s |
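The Rate column follows directly from the first two columns. The short check below recomputes it; it also shows that throughput is not constant across scales, since the larger batch ingests about 2.5 times more slowly per document. The function name is illustrative.

```swift
/// Documents ingested per second for a batch of `count` documents taking `seconds`.
func ingestRate(count: Int, seconds: Double) -> Double {
    Double(count) / seconds
}

let small = ingestRate(count: 1_000, seconds: 0.309)   // 1,000-doc row
let large = ingestRate(count: 10_000, seconds: 7.756)  // 10,000-doc row

// Ratio of the two rates: how much slower per-document ingest is at 10,000 docs.
let slowdown = small / large
```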
Hybrid Search Performance
- Cold Open → First Query: 17ms
- Hybrid Search @ 10K docs: 105ms
Platform Requirements
- Swift: 6.2+
- iOS: 26+ / macOS: 26+
- Hardware: Apple Silicon (for Metal GPU features)
Integration with Swarm
Wax integrates with Christopher Karani’s Swarm agent framework through a SwiftPM trait:
- WaxMemory provides RAG-backed memory storage
- Combines with agent runners and environment injection
- Trait-based compilation ensures minimal overhead when not used
Installation
```swift
.package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.1")
```
Quick Start Example
```swift
// Create memory
let brain = try await MemoryOrchestrator(...)
// Store information
try await brain.remember("User prefers dark mode")
// Retrieve relevant context
let results = try await brain.recall(query: "What are the user's preferences?")
```
Comparison to Traditional RAG
| Aspect | Traditional RAG | Wax |
|---|---|---|
| Infrastructure | Docker, vector DB, network | Single file |
| Privacy | Data sent to cloud | Stays on device |
| Performance | Network latency | Sub-millisecond GPU search |
| Reliability | Requires DevOps | Self-contained, crash-safe |
| Reproducibility | Non-deterministic | Deterministic token budgets |
Related Projects
- Swarm - Lightweight agent orchestration framework for Swift
- Hive - Deterministic, Swift-native graph runtime for agent workflows
Conclusion
Wax achieves the simplicity and reliability of SQLite (“one file, zero infrastructure”) for AI memory workloads through a custom storage format optimized for hybrid search. While it leverages SQLite FTS5 for one component (BM25 search), its core innovation is the .mv2s format, which bundles documents, embeddings, and crash recovery into a single, privacy-preserving file, with Metal GPU acceleration applied to vector search at query time.