Architecture and Storage

Purpose

This document answers the question "Does Wax use SQLite under the hood?" and gives a technical overview of Wax’s storage architecture, search capabilities, and performance characteristics.

Executive Summary

Wax does NOT use SQLite as its primary storage engine. However, it does use SQLite FTS5 (Full-Text Search 5) as one component for BM25 full-text searching. The tagline “The SQLite for AI memory” is an analogy about simplicity and portability, not a description of its implementation.

Wax uses a custom binary file format (.mv2s) that combines multiple storage and indexing technologies into a single, self-contained, crash-safe file optimized for on-device AI memory workloads.

What Is Wax?

Repository: github.com/christopherkarani/Wax

Wax is a Swift library that provides a memory layer for on-device AI agents, eliminating the need for Docker, separate vector databases, and network infrastructure. It’s designed for:

🤖 AI assistants that remember users across launches
📱 Offline-first apps with serious search requirements
🔒 Privacy-critical products where data never leaves the device
🧪 Research tooling that needs reproducible retrieval
🎮 Agent workflows that require durable state

Storage Architecture

The .mv2s File Format

Wax’s custom binary format includes:

Dual 4KB header pages - Enables atomic updates without corrupting the file
Write-ahead log (WAL) ring buffer - Provides crash recovery
Compressed document payloads - Efficient storage of text content
Compressed embeddings - Vector representations for semantic search
Table of Contents with footer checksum - Integrity verification

Key Properties:

Append-only design
Checksum-verified
Dual-header for atomicity
Self-contained (single file)
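
To illustrate why two header pages give atomic updates, here is a minimal sketch of the general dual-header technique. The field names (generation, tocOffset), the FNV-1a checksum, and every function below are assumptions made for illustration; the actual .mv2s header layout is not described in this document.

```swift
// Hypothetical header model, NOT the real .mv2s layout.
struct HeaderPage {
    var generation: UInt64   // incremented on every commit
    var tocOffset: UInt64    // where the current table of contents starts
    var checksum: UInt32     // checksum over the two fields above

    // Bytes covered by the checksum, in a fixed little-endian order.
    var payloadBytes: [UInt8] {
        var bytes: [UInt8] = []
        withUnsafeBytes(of: generation.littleEndian) { bytes.append(contentsOf: $0) }
        withUnsafeBytes(of: tocOffset.littleEndian) { bytes.append(contentsOf: $0) }
        return bytes
    }

    var isValid: Bool { fnv1a(payloadBytes) == checksum }
}

// 32-bit FNV-1a, used here only as a stand-in checksum.
func fnv1a(_ bytes: [UInt8]) -> UInt32 {
    var hash: UInt32 = 2166136261
    for byte in bytes {
        hash ^= UInt32(byte)
        hash = hash &* 16777619
    }
    return hash
}

// Builds a header whose checksum matches its contents.
func makeHeader(generation: UInt64, tocOffset: UInt64) -> HeaderPage {
    var page = HeaderPage(generation: generation, tocOffset: tocOffset, checksum: 0)
    page.checksum = fnv1a(page.payloadBytes)
    return page
}

// On open: trust the valid header with the highest generation.
// A torn write to one page leaves the other page usable.
func activeHeader(_ a: HeaderPage, _ b: HeaderPage) -> HeaderPage? {
    [a, b].filter { $0.isValid }.max { $0.generation < $1.generation }
}
```

In this scheme a commit always overwrites the page that is not currently active, so a crash in the middle of the write can only damage the stale copy.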

Hybrid Search Stack

Wax runs parallel search lanes that are fused at query time:

BM25 via SQLite FTS5 - Full-text search using SQLite’s FTS5 extension
HNSW vector index - Semantic search via the USearch library
Temporal evidence lanes - Time-based filtering
Structured evidence lanes - Metadata-based search

Query-Adaptive Fusion:

Date-based queries boost temporal signals
Conceptual queries boost vector + BM25 combination
The system selects the search strategy automatically, based on the query
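
One common way to fuse ranked lists from several lanes is weighted reciprocal rank fusion (RRF). The sketch below assumes that technique; this document does not say which fusion method Wax uses. The Lane enum, the keyword heuristic in laneWeights, and the weight values are all hypothetical.

```swift
// Hypothetical lane names mirroring the list above.
enum Lane: CaseIterable { case bm25, vector, temporal, structured }

// Toy query classifier: boosts the temporal lane when the query contains
// time-related words, otherwise boosts BM25 + vector. Weights are made up.
func laneWeights(for query: String) -> [Lane: Double] {
    let temporalWords: Set<String> = ["yesterday", "today", "last", "week", "month", "when"]
    let words = Set(query.lowercased().split(separator: " ").map { String($0) })
    if !words.isDisjoint(with: temporalWords) {
        return [.bm25: 1.0, .vector: 1.0, .temporal: 3.0, .structured: 1.0]
    }
    return [.bm25: 1.5, .vector: 1.5, .temporal: 0.5, .structured: 1.0]
}

// Weighted RRF: score(doc) = sum over lanes of weight / (k + rank),
// where rank is the 1-based position of the doc in that lane's result list.
func fuse(_ rankings: [Lane: [String]], weights: [Lane: Double], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    // Iterate lanes in a fixed order so floating-point sums are reproducible.
    for lane in Lane.allCases {
        guard let docIDs = rankings[lane] else { continue }
        let weight = weights[lane] ?? 1.0
        for (index, id) in docIDs.enumerated() {
            scores[id, default: 0] += weight / (k + Double(index + 1))
        }
    }
    // Sort by score descending; break ties by ID ascending for determinism.
    return scores.sorted { ($0.value, $1.key) > ($1.value, $0.key) }.map { $0.key }
}
```

With the same three lane results, a conceptual query and a date-like query produce different final orders because only the weights change, which is the behavior the bullets above describe.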

SQLite FTS5 Integration

How SQLite is Used:

SQLite FTS5 provides the BM25 ranking algorithm for traditional full-text search
Its query latency (150ms) is used as the baseline in the benchmarks
Used as one lane in the hybrid search pipeline, not as the primary storage

What SQLite is NOT used for:

Not the primary database engine
Not used for document storage
Not used for vector indexing
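
For readers unfamiliar with BM25, the ranking that the FTS5 lane contributes can be sketched in plain Swift. This is the textbook BM25 formula over pre-tokenized documents, not Wax's code and not a call into SQLite. The defaults k1 = 1.2 and b = 0.75 match the constants listed in SQLite's FTS5 documentation, which also negates the result so that better matches sort first in ascending order. The function name and signature are hypothetical.

```swift
import Foundation

// Textbook BM25 over tokenized documents (illustrative only).
func bm25(queryTerms: [String], document: [String], corpus: [[String]],
          k1: Double = 1.2, b: Double = 0.75) -> Double {
    guard !corpus.isEmpty else { return 0 }
    let docCount = Double(corpus.count)
    let avgLength = Double(corpus.map { $0.count }.reduce(0, +)) / docCount
    guard avgLength > 0 else { return 0 }

    var score = 0.0
    for term in queryTerms {
        // Number of documents that contain the term at least once.
        let docsWithTerm = Double(corpus.filter { $0.contains(term) }.count)
        // Term frequency inside the document being scored.
        let tf = Double(document.filter { $0 == term }.count)
        // Inverse document frequency: rare terms weigh more.
        let idf = log((docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5))
        // Length normalization: long documents are penalized.
        let lengthNorm = 1 - b + b * Double(document.count) / avgLength
        score += idf * tf * (k1 + 1) / (tf + k1 * lengthNorm)
    }
    return score
}
```

A document that does not contain any query term scores exactly zero, which is why a lexical lane alone misses paraphrases and benefits from being combined with vector search.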

Memory Types

Wax supports multiple memory modalities:

📝 Text Memory - Documents, notes, conversations
📸 Photo Memory - Photo library with OCR + CLIP embeddings
🎬 Video Memory - Video segments with transcripts

Hierarchical Summarization

Wax generates three-tier summaries:

full - Complete document (for deep dives)
gist - Key paragraphs (for balanced recall)
micro - One-liner (for quick context)

At query time, it selects the appropriate tier based on query signals and remaining token budget.
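
A minimal sketch of budget-driven tier selection, assuming the simplest possible policy: return the richest tier that still fits. The Summaries type, the selectTier function, and the whitespace-based token counter are hypothetical stand-ins; the query signals mentioned above are not modeled here.

```swift
// Hypothetical container for the three tiers of one document.
struct Summaries {
    let full: String
    let gist: String
    let micro: String
}

// Stand-in token counter: whitespace-separated words, NOT cl100k_base.
func approximateTokenCount(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

// Returns the richest tier that fits the remaining budget,
// or nil if even the micro summary is too large.
func selectTier(from summaries: Summaries, remainingTokens: Int) -> String? {
    [summaries.full, summaries.gist, summaries.micro].first {
        approximateTokenCount($0) <= remainingTokens
    }
}
```

Because the tiers are tried from largest to smallest, a generous budget yields the full document while a nearly exhausted budget degrades to the one-liner instead of dropping the document entirely.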

Deterministic Token Budgeting

Wax uses strict cl100k_base token counting to prevent:

Context window overflows
Non-deterministic truncation
Irreproducible results

This makes RAG behavior testable and benchmarkable.
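
The idea of a deterministic budget can be sketched as a packing step with a fixed total order and no mid-text truncation. Everything here is an assumption for illustration: the Candidate type, the pack function, and the wordCount stand-in, which counts whitespace-separated words rather than cl100k_base tokens.

```swift
// Hypothetical search result awaiting inclusion in the prompt context.
struct Candidate {
    let id: String
    let score: Double
    let text: String
}

// Stand-in for a real tokenizer; a production version would count cl100k_base tokens.
func wordCount(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

// Packs candidates into a fixed token budget.
// The same inputs always yield the same output, regardless of input order.
func pack(_ candidates: [Candidate], budget: Int,
          countTokens: (String) -> Int) -> [Candidate] {
    // Total order: score descending, then ID ascending to break ties.
    let ordered = candidates.sorted { ($0.score, $1.id) > ($1.score, $0.id) }
    var remaining = budget
    var selected: [Candidate] = []
    for candidate in ordered {
        let cost = countTokens(candidate.text)
        // Skip items that do not fit instead of cutting them mid-text.
        guard cost <= remaining else { continue }
        selected.append(candidate)
        remaining -= cost
    }
    return selected
}
```

Since the output depends only on the candidate set and the budget, a test can assert the exact context that a query produces.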

Metal GPU Acceleration

Vector search performance on Apple Silicon (M1 Pro, 10K docs × 384 dimensions):

Mode	Latency
Metal warm	0.84ms
Metal cold	9.2ms
CPU fallback	105ms
SQLite FTS5 baseline	150ms

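Key Insight: With a warm Metal pipeline, vector search completes in under a millisecond (0.84ms), roughly 125× faster than the CPU fallback (105ms). The SQLite FTS5 baseline (150ms) is a full-text search figure, so treat it as a reference point rather than a like-for-like comparison.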
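
To make the workload concrete, the following sketch shows what an exhaustive vector search computes: cosine similarity between a query embedding and every stored embedding, followed by a top-k selection. It is a plain CPU illustration under assumed names (cosine, topK); it is neither Wax's Metal kernel nor the HNSW index from USearch, which avoids scanning every vector.

```swift
// Cosine similarity between two equal-length vectors.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "vectors must have the same dimension")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denominator = (normA * normB).squareRoot()
    return denominator == 0 ? 0 : dot / denominator
}

// Brute-force top-k: scores every vector, then keeps the k best.
// Ties are broken by the lower index so results are reproducible.
func topK(query: [Float], in vectors: [[Float]], k: Int) -> [(index: Int, score: Float)] {
    let scored = vectors.enumerated().map {
        (index: $0.offset, score: cosine(query, $0.element))
    }
    let ordered = scored.sorted { ($0.score, $1.index) > ($1.score, $0.index) }
    return Array(ordered.prefix(max(0, k)))
}
```

Each similarity is independent of the others, which is the property that makes this kind of scan a natural fit for GPU parallelism.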

Performance Benchmarks

Ingest Throughput

Scale	Time	Rate
1,000 docs	0.309s	~3,236 docs/s
10,000 docs	7.756s	~1,289 docs/s

Hybrid Search Performance

Cold Open → First Query: 17ms
Hybrid Search @ 10K docs: 105ms

Platform Requirements

Swift: 6.2+
iOS: 26+ / macOS: 26+
Hardware: Apple Silicon (for Metal GPU features)

Integration with Swarm

Wax integrates with Christopher Karani’s Swarm agent framework through a SwiftPM trait:

WaxMemory provides RAG-backed memory storage
Combines with agent runners and environment injection
Trait-based compilation ensures minimal overhead when not used

Installation

.package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.1")
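
In context, that dependency line sits inside a Package.swift manifest. The sketch below is one possible manifest, not a file from the Wax repository: the package and target name MyAgentApp is hypothetical, and the product name "Wax" is an assumption that should be checked against the package's own manifest. The tools version and platform minimums follow the Platform Requirements section.

```swift
// swift-tools-version: 6.2
import PackageDescription

let package = Package(
    name: "MyAgentApp", // hypothetical app name
    platforms: [
        .iOS("26.0"),
        .macOS("26.0")
    ],
    dependencies: [
        .package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.1")
    ],
    targets: [
        .executableTarget(
            name: "MyAgentApp",
            dependencies: [
                // Assumed product name; verify it in Wax's Package.swift.
                .product(name: "Wax", package: "Wax")
            ]
        )
    ]
)
```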

Quick Start Example

// Create memory
let brain = try await MemoryOrchestrator(...)

// Store information
try await brain.remember("User prefers dark mode")

// Retrieve relevant context
let results = try await brain.recall(query: "What are the user's preferences?")

Comparison to Traditional RAG

Aspect	Traditional RAG	Wax
Infrastructure	Docker, vector DB, network	Single file
Privacy	Data sent to cloud	Stays on device
Performance	Network latency	Sub-millisecond GPU search
Reliability	Requires DevOps	Self-contained, crash-safe
Reproducibility	Non-deterministic	Deterministic token budgets

Related Projects

Swarm - Lightweight agent orchestration framework for Swift
Hive - Deterministic, Swift-native graph runtime for agent workflows

Conclusion

Wax achieves the simplicity and reliability of SQLite (“one file, zero infrastructure”) for AI memory workloads through a custom storage format optimized for hybrid search. While it leverages SQLite FTS5 for one component (BM25 search), its core innovation is the .mv2s format that bundles vector indexing, crash recovery, and GPU acceleration into a single, privacy-preserving file.