Termite is Antfly's local ML inference service for embeddings, chunking, and reranking. It runs ONNX-optimized models for fast CPU inference without external API dependencies.

Features

  • Embedding Generation - Text and multimodal (CLIP) embedding models
  • Text Chunking - Semantic chunking with ONNX models or fixed-size fallback
  • Reranking - Relevance re-scoring for search results
  • Privacy - Data never leaves your infrastructure

Model Management

List Available Models

View all available models in the registry:

antfly termite list --remote

Download Models

Download quantized ONNX models (the i8 variant is INT8-quantized for fast CPU inference):

antfly termite pull --variants i8 bge-small-en-v1.5 chonky-mmbert-small-multilingual-1 mxbai-rerank-base-v1

Available models by type:

  • Embedders: bge-small-en-v1.5, mxbai-embed-large-v1, clip-vit-base-patch32
  • Chunkers: chonky-mmbert-small-multilingual-1
  • Rerankers: mxbai-rerank-base-v1

List Local Models

View models you've downloaded:

antfly termite list

Model Variants

Models support different quantization variants:

Variant   Description
f32       Full precision (largest, most accurate)
i8        INT8 quantized (smaller, faster, recommended)

Use --variants to specify which variant to download:

antfly termite pull --variants i8,f32 bge-small-en-v1.5
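For scripted setups, the pull command above can be assembled programmatically. The following is a minimal sketch, assuming the antfly binary is on your PATH. build_pull_command is a hypothetical helper name, not part of Antfly, and it uses only the pull subcommand and --variants flag shown above.

```python
import shlex

# Variants documented in the table above.
KNOWN_VARIANTS = ("f32", "i8")


def build_pull_command(models, variants=("i8",)):
    """Return the argv list for `antfly termite pull` (hypothetical helper)."""
    if not models:
        raise ValueError("at least one model name is required")
    unknown = [v for v in variants if v not in KNOWN_VARIANTS]
    if unknown:
        raise ValueError(f"unknown variant(s): {', '.join(unknown)}")
    # Variants are passed as one comma-separated value, as in the example above.
    return ["antfly", "termite", "pull", "--variants", ",".join(variants), *models]


if __name__ == "__main__":
    argv = build_pull_command(["bge-small-en-v1.5"], variants=("i8", "f32"))
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(argv))
    # To actually run it: import subprocess; subprocess.run(argv, check=True)
```

Building an argv list rather than a shell string avoids quoting problems if the list is later passed to subprocess.run.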

Model Storage

Models are stored in the following directories:

  • ./models/embedders/ - Embedding models
  • ./models/chunkers/ - Chunking models
  • ./models/rerankers/ - Reranking models

Termite auto-discovers and loads models from these directories when Antfly starts in swarm mode.
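To check what is present in these directories before starting Antfly, a short script can walk the layout above. This is a sketch only: it assumes each downloaded model occupies one subdirectory under its type directory, which this page does not state, and discover_models is a hypothetical helper rather than Termite's own discovery logic.

```python
from pathlib import Path

# Subdirectory names taken from the storage layout above.
MODEL_KINDS = ("embedders", "chunkers", "rerankers")


def discover_models(root="./models"):
    """Map each model type to the sorted names of its subdirectories.

    Assumption: each downloaded model occupies one subdirectory.
    """
    found = {}
    for kind in MODEL_KINDS:
        kind_dir = Path(root) / kind
        if kind_dir.is_dir():
            found[kind] = sorted(p.name for p in kind_dir.iterdir() if p.is_dir())
        else:
            # A missing type directory is reported as having no models.
            found[kind] = []
    return found


if __name__ == "__main__":
    for kind, names in discover_models().items():
        print(f"{kind}: {', '.join(names) or '(none)'}")
```

For the authoritative list of local models, use antfly termite list as shown earlier.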

Configuration

Reference Termite models in index and query configurations by setting the provider to termite and naming the model:

Embedder Configuration

{
  "provider": "termite",
  "model": "bge-small-en-v1.5"
}

Chunker Configuration

{
  "provider": "termite",
  "model": "chonky-mmbert-small-multilingual-1",
  "target_tokens": 512,
  "overlap_tokens": 50
}
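To illustrate how target_tokens and overlap_tokens might interact, here is a sketch of fixed-size chunking with overlap. It rests on assumptions: target_tokens is read as the maximum chunk length, overlap_tokens as the number of tokens shared by consecutive chunks, and the caller supplies an already tokenized list. Termite's actual tokenizer and chunking logic are not described on this page and may differ.

```python
def chunk_tokens(tokens, target_tokens=512, overlap_tokens=50):
    """Split a token list into fixed-size chunks that overlap (illustrative only)."""
    if target_tokens <= 0:
        raise ValueError("target_tokens must be positive")
    if not 0 <= overlap_tokens < target_tokens:
        raise ValueError("overlap_tokens must be in [0, target_tokens)")
    # Each chunk starts this many tokens after the previous one.
    step = target_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target_tokens])
        # Stop once a chunk has reached the end of the input.
        if start + target_tokens >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    words = "one two three four five six seven eight nine ten".split()
    for chunk in chunk_tokens(words, target_tokens=4, overlap_tokens=1):
        print(chunk)
```

Under this reading, a larger overlap_tokens keeps more context across chunk boundaries at the cost of producing more chunks.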

Reranker Configuration

{
  "provider": "termite",
  "model": "mxbai-rerank-base-v1",
  "field": "body"
}
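The three fragments above share the same provider and model keys, so they can be generated from one small function. The sketch below is hypothetical: termite_config is not part of Antfly, and the model names it accepts are only those listed on this page, so a model added to the registry later would be rejected until the table is updated.

```python
import json

# Model names listed on this page, grouped by type.
KNOWN_MODELS = {
    "embedder": {"bge-small-en-v1.5", "mxbai-embed-large-v1", "clip-vit-base-patch32"},
    "chunker": {"chonky-mmbert-small-multilingual-1"},
    "reranker": {"mxbai-rerank-base-v1"},
}


def termite_config(kind, model, **options):
    """Build a Termite config fragment (hypothetical helper)."""
    if kind not in KNOWN_MODELS:
        raise ValueError(f"unknown model type: {kind}")
    if model not in KNOWN_MODELS[kind]:
        raise ValueError(f"{model} is not a known {kind} model")
    # Extra keys such as target_tokens or field are passed through unchanged.
    return {"provider": "termite", "model": model, **options}


if __name__ == "__main__":
    reranker = termite_config("reranker", "mxbai-rerank-base-v1", field="body")
    print(json.dumps(reranker, indent=2))
```

Checking the model name against its type catches mistakes such as passing a reranker model where an embedder is expected.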

API Reference

See the Termite API documentation for details on the HTTP endpoints.