Embedding Models#

Learn how to use Termite to generate vector embeddings for semantic search, RAG pipelines, and similarity matching.

Overview#

Termite supports several families of embedding models for generating vector representations of text and images:

Text Embeddings - BAAI/bge, sentence-transformers, and more
Multimodal Embeddings - CLIP models for joint text-image embeddings
Quantized Models - Faster inference with INT8 quantization

Supported Models#

Text Embedding Models#

| Model | Dimensions | Use Case |
| --- | --- | --- |
| `BAAI/bge-small-en-v1.5` | 384 | General purpose, fast |
| `BAAI/bge-base-en-v1.5` | 768 | Higher quality |
| `BAAI/bge-large-en-v1.5` | 1024 | Best quality |
| `sentence-transformers/all-MiniLM-L6-v2` | 384 | Lightweight |

Multimodal Models#

| Model | Dimensions | Use Case |
| --- | --- | --- |
| `openai/clip-vit-base-patch32` | 512 | Image-text similarity |
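
Embeddings produced by the same model are usually compared with cosine similarity, which is how image-text similarity is typically scored for CLIP-style models. The following is a minimal Python sketch of that comparison. The function name is illustrative, and the toy 3-dimensional vectors stand in for real embeddings; they are not Termite output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumption): real embeddings would have 384 to 1024 dimensions.
text_vec = [1.0, 0.0, 1.0]
image_vec = [1.0, 0.0, 0.0]
score = cosine_similarity(text_vec, image_vec)
print(f"similarity: {score:.4f}")
```

Only compare vectors that come from the same model: the dimensions in the tables above differ between models, and the length check rejects mismatched inputs.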

Quick Start#

Generate Text Embeddings#

curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-small-en-v1.5",
    "input": ["Hello world", "Machine learning is amazing"]
  }'
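
The same request can be issued from Python with only the standard library. This is a sketch, not an official client: `build_embed_request` and `embed` are hypothetical helper names, and because the response schema is not described on this page, `embed` returns the decoded JSON body without assuming any field names.

```python
import json
import urllib.request

TERMITE_URL = "http://localhost:8082"  # same host and port as the curl example


def build_embed_request(texts, model="BAAI/bge-small-en-v1.5", base_url=TERMITE_URL):
    """Build (but do not send) a POST request for /api/embed."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/api/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(build_embed_request(texts, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request needs no running server; call embed(...) to send it.
request = build_embed_request(["Hello world", "Machine learning is amazing"])
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect and test without a running Termite instance.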

Generate Image Embeddings#

curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/clip-vit-base-patch32",
    "input": [
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
    ]
  }'
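
The `data:` URL in the request above is the image's bytes, base64-encoded, with a MIME-type prefix. A small Python sketch of building that input item follows. The helper names are hypothetical, the placeholder bytes are not a real image, and whether MIME types other than `image/png` are accepted is not stated here.

```python
import base64


def image_input_item(image_bytes, mime_type="image/png"):
    """Wrap raw image bytes in the image_url input shape from the curl example."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def build_image_payload(images, model="openai/clip-vit-base-patch32"):
    """Build the JSON body for embedding one or more images."""
    return {"model": model, "input": [image_input_item(img) for img in images]}


# Placeholder bytes (assumption); in practice use open(path, "rb").read().
payload = build_image_payload([b"abc"])
print(payload["input"][0]["image_url"]["url"])
```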

Best Practices#

Model Selection#

Start small - Use `BAAI/bge-small-en-v1.5` (384 dimensions) for prototyping
Scale up - Move to `BAAI/bge-base-en-v1.5` or `BAAI/bge-large-en-v1.5` for production when you need higher quality
Use quantization - Enable INT8 quantization for faster inference
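
Model choice also affects how much space the resulting vectors take, since larger models emit more dimensions. The sketch below estimates raw vector storage for one million documents. It assumes each value is stored as a 4-byte float32 and ignores any index overhead; the actual storage format depends on your vector store.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32

# Dimensions taken from the Text Embedding Models table.
MODEL_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "BAAI/bge-large-en-v1.5": 1024,
}


def vector_storage_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of num_vectors embeddings, excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value


sizes = {
    model: vector_storage_bytes(1_000_000, dims)
    for model, dims in MODEL_DIMENSIONS.items()
}
for model, size in sizes.items():
    print(f"{model}: {size / 1e9:.2f} GB")
# BAAI/bge-small-en-v1.5: 1.54 GB
# BAAI/bge-base-en-v1.5: 3.07 GB
# BAAI/bge-large-en-v1.5: 4.10 GB
```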

Batch Processing#

Send multiple texts in a single request for better throughput:

{
  "model": "BAAI/bge-small-en-v1.5",
  "input": [
    "First document...",
    "Second document...",
    "Third document..."
  ]
}
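
For a large corpus, one way to apply this is to split the documents into fixed-size chunks and send one request per chunk. The Python sketch below builds the request bodies only. The helper names and the `batch_size` value are illustrative; this page does not state a maximum batch size, so tune it for your deployment.

```python
def batches(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def batch_payloads(texts, batch_size, model="BAAI/bge-small-en-v1.5"):
    """Build one /api/embed request body per batch of texts."""
    return [{"model": model, "input": chunk} for chunk in batches(texts, batch_size)]


docs = [f"Document {i}" for i in range(1, 6)]
payloads = batch_payloads(docs, batch_size=2)
print(len(payloads))  # 3 request bodies: sizes 2, 2, and 1
```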

Caching#

Termite automatically caches embeddings for 2 minutes. An identical request within this window returns the cached result rather than recomputing the embeddings.
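
To make the time-window behavior concrete, here is a small Python sketch of a time-to-live cache keyed on the request contents. It illustrates the general technique only and is not Termite's implementation: the class, the key derivation, and the injected clock are all assumptions made for the example.

```python
import hashlib
import json
import time


class TTLCache:
    """Store values that expire a fixed number of seconds after insertion."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    @staticmethod
    def key_for(model, inputs):
        """Derive a stable key so identical requests map to the same entry."""
        blob = json.dumps({"model": model, "input": inputs}, sort_keys=True)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


# A fake clock makes the 2-minute window easy to demonstrate.
now = [0.0]
cache = TTLCache(ttl_seconds=120.0, clock=lambda: now[0])
key = TTLCache.key_for("BAAI/bge-small-en-v1.5", ["Hello world"])
cache.put(key, [[0.1, 0.2]])  # placeholder vector, not real model output

now[0] = 119.0
hit = cache.get(key)   # still inside the window
now[0] = 121.0
miss = cache.get(key)  # window has elapsed
```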

Integration with Antfly#

Use Termite embeddings with Antfly for vector search:

indexes:
  - type: embedding
    config:
      model: BAAI/bge-small-en-v1.5
      termite_url: http://localhost:8082

Next Steps#

API Reference - Embedding API details
Reranking - Improve search relevance
Models - Browse all available models