Purpose

This document answers the question “Does Wax use SQLite under the hood?” and provides a technical overview of Wax’s storage architecture, search capabilities, and performance characteristics.

Executive Summary

Wax does NOT use SQLite as its primary storage engine. However, it does use SQLite FTS5 (Full-Text Search 5) as one component for BM25 full-text searching. The tagline “The SQLite for AI memory” is an analogy about simplicity and portability, not a description of its implementation.

Wax uses a custom binary file format (.mv2s) that combines multiple storage and indexing technologies into a single, self-contained, crash-safe file optimized for on-device AI memory workloads.

What Is Wax?

Repository: github.com/christopherkarani/Wax

Wax is a Swift library that provides a memory layer for on-device AI agents, eliminating the need for Docker, separate vector databases, and network infrastructure. It’s designed for:

  • 🤖 AI assistants that remember users across launches
  • 📱 Offline-first apps with serious search requirements
  • 🔒 Privacy-critical products where data never leaves the device
  • 🧪 Research tooling that needs reproducible retrieval
  • 🎮 Agent workflows that require durable state

Storage Architecture

The .mv2s File Format

Wax’s custom binary format includes:

  • Dual 4 KB header pages - Enable atomic updates without corrupting the file
  • Write-ahead log (WAL) ring buffer - Provides crash recovery
  • Compressed document payloads - Efficient storage of text content
  • Compressed embeddings - Vector representations for semantic search
  • Table of Contents with footer checksum - Integrity verification

Key Properties:

  • Append-only design
  • Checksum-verified
  • Dual-header for atomicity
  • Self-contained (single file)
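
The dual-header idea can be illustrated with a small sketch. This is not Wax's actual on-disk layout: the HeaderPage fields, the generation counter, and the FNV-1a checksum are all assumptions chosen to keep the example short. It only shows the general technique of keeping two header slots and always overwriting the inactive one.

```swift
// Hypothetical header layout for illustration; not the real .mv2s format.
struct HeaderPage {
    var generation: UInt64   // assumed commit counter, bumped on every commit
    var tocOffset: UInt64    // assumed file offset of the table of contents
    var checksum: UInt32     // checksum over the two fields above

    init(generation: UInt64, tocOffset: UInt64) {
        self.generation = generation
        self.tocOffset = tocOffset
        self.checksum = HeaderPage.fnv1a(generation, tocOffset)
    }

    // FNV-1a over the little-endian bytes of each value (stand-in checksum).
    static func fnv1a(_ values: UInt64...) -> UInt32 {
        var hash: UInt32 = 2_166_136_261
        for value in values {
            for shift in stride(from: 0, to: 64, by: 8) {
                hash ^= UInt32((value >> shift) & 0xFF)
                hash = hash &* 16_777_619
            }
        }
        return hash
    }

    var isValid: Bool { checksum == HeaderPage.fnv1a(generation, tocOffset) }
}

// On open: trust the valid header with the highest generation.
func activeHeader(_ a: HeaderPage, _ b: HeaderPage) -> HeaderPage? {
    [a, b].filter { $0.isValid }.max { $0.generation < $1.generation }
}

// On commit: return the index of the slot that is NOT active, so a crash
// in the middle of the write leaves the previous header untouched.
func slotToOverwrite(_ a: HeaderPage, _ b: HeaderPage) -> Int {
    let aIsActive = a.isValid && (!b.isValid || a.generation >= b.generation)
    return aIsActive ? 1 : 0
}
```

If the newer header is torn by a crash, its checksum no longer matches, so the reader falls back to the older header and the file still opens in its last committed state.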

Hybrid Search Stack

Wax runs parallel search lanes that are fused at query time:

  1. BM25 via SQLite FTS5 - Full-text search using SQLite’s FTS5 extension
  2. HNSW vector index - Semantic search via the USearch library
  3. Temporal evidence lanes - Time-based filtering
  4. Structured evidence lanes - Metadata-based search

Query-Adaptive Fusion:

  • Date-based queries boost temporal signals
  • Conceptual queries boost vector + BM25 combination
  • The system picks the search strategy automatically for each query
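
One common way to fuse ranked lists from several lanes is weighted reciprocal rank fusion (RRF). The sketch below uses it to illustrate query-adaptive fusion. Whether Wax uses RRF is not stated in this document, and the Lane enum, the weights, and the keyword heuristic are all hypothetical.

```swift
import Foundation

// Hypothetical lane names mirroring the list above.
enum Lane: CaseIterable, Hashable { case bm25, vector, temporal, structured }

// Assumed heuristic: temporal wording boosts the temporal lane,
// anything else boosts the BM25 + vector combination.
func laneWeights(forQuery query: String) -> [Lane: Double] {
    let temporalHints = ["yesterday", "today", "last week", "last month"]
    let lower = query.lowercased()
    let looksTemporal = temporalHints.contains { lower.contains($0) }
    return looksTemporal
        ? [.bm25: 1.0, .vector: 1.0, .temporal: 2.0, .structured: 1.0]
        : [.bm25: 1.5, .vector: 1.5, .temporal: 0.5, .structured: 1.0]
}

// Weighted RRF: score(doc) = sum over lanes of weight / (k + rank),
// where rank is the 1-based position of the doc in that lane's results.
func fuse(_ rankings: [Lane: [String]], weights: [Lane: Double], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    for lane in Lane.allCases {            // fixed lane order keeps sums reproducible
        guard let docs = rankings[lane] else { continue }
        let weight = weights[lane] ?? 1.0
        for (index, doc) in docs.enumerated() {
            scores[doc, default: 0] += weight / (k + Double(index + 1))
        }
    }
    // Highest score first; ties broken by document ID for deterministic output.
    return scores.sorted { lhs, rhs in
        lhs.value != rhs.value ? lhs.value > rhs.value : lhs.key < rhs.key
    }.map { $0.key }
}
```

With lane results bm25 = [a, b, c], vector = [b, a, d], and temporal = [d], a conceptual query ranks a and b first, while a query containing "yesterday" moves d to the top because the temporal lane's weight is doubled.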

SQLite FTS5 Integration

How SQLite is Used:

  • SQLite FTS5 provides the BM25 ranking algorithm for traditional full-text search
  • It’s explicitly benchmarked as the baseline (150 ms query latency)
  • Used as one lane in the hybrid search pipeline, not as the primary storage

What SQLite is NOT used for:

  • Not the primary database engine
  • Not used for document storage
  • Not used for vector indexing

Memory Types

Wax supports multiple memory modalities:

  • 📝 Text Memory - Documents, notes, conversations
  • 📸 Photo Memory - Photo library with OCR + CLIP embeddings
  • 🎬 Video Memory - Video segments with transcripts

Hierarchical Summarization

Wax generates three-tier summaries:

  • full - Complete document (for deep dives)
  • gist - Key paragraphs (for balanced recall)
  • micro - One-liner (for quick context)

At query time, it selects the appropriate tier based on query signals and remaining token budget.
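
A minimal sketch of budget-driven tier selection follows. It covers only the token-budget half of the rule (query signals are left out), and the Tier and Summaries types and the whitespace-based token counter are hypothetical stand-ins rather than Wax's API.

```swift
// Hypothetical types; Wax's real summary tiers and selection rules may differ.
enum Tier { case full, gist, micro }

struct Summaries {
    let full: String
    let gist: String
    let micro: String
}

// Stand-in token counter (whitespace split). A real implementation would
// count cl100k_base tokens instead.
func approxTokens(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

// Pick the richest tier that still fits the remaining budget.
func selectTier(_ summaries: Summaries, remainingTokens: Int) -> (tier: Tier, text: String)? {
    let candidates: [(Tier, String)] = [
        (.full, summaries.full),
        (.gist, summaries.gist),
        (.micro, summaries.micro)
    ]
    for (tier, text) in candidates where approxTokens(text) <= remainingTokens {
        return (tier, text)
    }
    return nil // nothing fits, so the caller skips this document
}
```

Trying tiers from richest to shortest means a document degrades gracefully from full to gist to micro as the budget shrinks, instead of being cut off mid-sentence.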

Deterministic Token Budgeting

Wax uses strict cl100k_base token counting to prevent:

  • Context window overflows
  • Non-deterministic truncation
  • Irreproducible results

This makes RAG behavior testable and benchmarkable.
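
The sketch below shows why exact token counting makes context assembly reproducible. The packContext function and its parameters are hypothetical. The token counter is passed in as a closure so that a cl100k_base tokenizer could be supplied; the usage example substitutes a whitespace split.

```swift
// Hypothetical helper: greedily packs ranked snippets into a fixed token budget.
// Snippets are taken whole or skipped, never truncated, so the same input
// always yields the same context.
func packContext(
    _ rankedSnippets: [String],
    budget: Int,
    countTokens: (String) -> Int
) -> [String] {
    var remaining = budget
    var selected: [String] = []
    for snippet in rankedSnippets {
        let cost = countTokens(snippet)
        if cost > remaining { continue }   // does not fit: skip, try the next one
        selected.append(snippet)
        remaining -= cost
    }
    return selected
}

// Usage with a stand-in counter (one token per whitespace-separated word).
let wordCount: (String) -> Int = { $0.split(separator: " ").count }
let context = packContext(["a b c", "d e f g", "h"], budget: 4, countTokens: wordCount)
// context == ["a b c", "h"]
```

Because the budget check uses an exact count and a fixed iteration order, a test can assert on the precise context that a query produces.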

Metal GPU Acceleration

Vector search performance on Apple Silicon (M1 Pro, 10K docs × 384 dimensions):

Mode                    Latency
Metal warm              0.84 ms
Metal cold              9.2 ms
CPU fallback            105 ms
SQLite FTS5 baseline    150 ms

Key Insight: With a warm Metal pipeline, vector search completes in under a millisecond (0.84 ms), about 125x faster than the CPU fallback (105 ms) and well below the 150 ms SQLite FTS5 baseline.

Performance Benchmarks

Ingest Throughput

Scale          Time       Rate
1,000 docs     0.309 s    ~3,236 docs/s
10,000 docs    7.756 s    ~1,289 docs/s
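
The rates follow directly from the scale and time figures. The short check below recomputes them; the variable names are illustrative only.

```swift
// Recompute throughput (docs per second) from the two benchmark rows above.
let runs: [(docs: Double, seconds: Double)] = [(1_000, 0.309), (10_000, 7.756)]
let rates = runs.map { ($0.docs / $0.seconds).rounded() }
// rates == [3236.0, 1289.0]

// Average time per document in milliseconds, to compare the two scales.
let perDocMs = runs.map { $0.seconds / $0.docs * 1_000 }
```

Per-document cost rises from about 0.31 ms to about 0.78 ms, so throughput drops by roughly 2.5x when the corpus grows 10x.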

Hybrid Search Performance

  • Cold Open → First Query: 17ms
  • Hybrid Search @ 10K docs: 105ms

Platform Requirements

  • Swift: 6.2+
  • iOS: 26+ / macOS: 26+
  • Hardware: Apple Silicon (for Metal GPU features)

Integration with Swarm

Wax integrates with Christopher Karani’s Swarm agent framework through a SwiftPM trait:

  • WaxMemory provides RAG-backed memory storage
  • Combines with agent runners and environment injection
  • Trait-based compilation ensures minimal overhead when not used

Installation

.package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.1")
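
For context, the dependency line above belongs in the dependencies array of a Package.swift manifest. The manifest below is a sketch: the package and target name MyAgentApp is hypothetical, and the product name "Wax" is an assumption that should be checked against the repository. The tools version and platform versions follow the Platform Requirements section.

```swift
// swift-tools-version: 6.2
import PackageDescription

let package = Package(
    name: "MyAgentApp",                        // hypothetical package name
    platforms: [.iOS("26.0"), .macOS("26.0")], // from Platform Requirements
    dependencies: [
        .package(url: "https://github.com/christopherkarani/Wax.git", from: "0.1.1")
    ],
    targets: [
        .executableTarget(
            name: "MyAgentApp",
            dependencies: [
                // Product name assumed; verify it in the Wax repository.
                .product(name: "Wax", package: "Wax")
            ]
        )
    ]
)
```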

Quick Start Example

// Create memory
let brain = try await MemoryOrchestrator(...)
// Store information
try await brain.remember("User prefers dark mode")
// Retrieve relevant context
let results = try await brain.recall(query: "What are the user's preferences?")

Comparison to Traditional RAG

Aspect             Traditional RAG               Wax
Infrastructure     Docker, vector DB, network    Single file
Privacy            Data sent to cloud            Stays on device
Performance        Network latency               Sub-millisecond GPU search
Reliability        Requires DevOps               Self-contained, crash-safe
Reproducibility    Non-deterministic             Deterministic token budgets

Related Projects

  • Swarm - Lightweight agent orchestration framework for Swift
  • Hive - Deterministic, Swift-native graph runtime for agent workflows

Conclusion

Wax achieves the simplicity and reliability of SQLite (“one file, zero infrastructure”) for AI memory workloads through a custom storage format optimized for hybrid search. While it leverages SQLite FTS5 for one component (BM25 search), its core innovation is the .mv2s format that bundles vector indexing, crash recovery, and GPU acceleration into a single, privacy-preserving file.
