Embedding Models

Learn how to use Termite to generate vector embeddings for semantic search, RAG pipelines, and similarity matching.

Overview

Termite supports various embedding models for generating vector representations of text and images:

  • Text Embeddings - BAAI/bge, sentence-transformers, and more
  • Multimodal Embeddings - CLIP models for joint text-image embeddings
  • Quantized Models - Faster inference with INT8 quantization

Supported Models

Text Embedding Models

Model                                    Dimensions  Use Case
BAAI/bge-small-en-v1.5                   384         General purpose, fast
BAAI/bge-base-en-v1.5                    768         Higher quality
BAAI/bge-large-en-v1.5                   1024        Best quality
sentence-transformers/all-MiniLM-L6-v2   384         Lightweight

Multimodal Models

Model                                    Dimensions  Use Case
openai/clip-vit-base-patch32             512         Image-text similarity
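
For similarity matching, two embeddings are usually compared with cosine similarity. The sketch below is a plain-Python illustration, not part of Termite; the function name is hypothetical. It rejects vectors of different lengths, since embeddings produced by models with different dimensions (for example 384 and 768) cannot be compared directly.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        # Vectors from models with different dimensions are not comparable.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.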

Quick Start

Generate Text Embeddings

curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-small-en-v1.5",
    "input": ["Hello world", "Machine learning is amazing"]
  }'
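
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above: the URL, header, and payload fields come from that example, while the function names are hypothetical. The response schema is not described in this guide, so the sketch returns the decoded JSON as-is rather than assuming any field names.

```python
import json
import urllib.request

# Endpoint copied from the curl example; adjust host and port as needed.
TERMITE_URL = "http://localhost:8082/api/embed"


def build_embed_request(model, inputs, url=TERMITE_URL):
    """Build (but do not send) a POST request for the embed endpoint."""
    body = json.dumps({"model": model, "input": list(inputs)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model, inputs, url=TERMITE_URL, timeout=30):
    """Send the request and return the decoded JSON response unchanged.

    No response fields are assumed here; inspect the returned object
    to locate the vectors.
    """
    request = build_embed_request(model, inputs, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Termite server running locally, `embed("BAAI/bge-small-en-v1.5", ["Hello world"])` would issue the request shown in the curl example.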

Generate Image Embeddings

curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/clip-vit-base-patch32",
    "input": [
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
    ]
  }'
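
The `url` field in the request above is a base64 data URL. The sketch below shows one way to build that value and the surrounding payload from raw image bytes. The helper names are hypothetical, and the default MIME type of `image/png` is an assumption taken from the example.

```python
import base64


def image_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_image_payload(model, image_bytes, mime_type="image/png"):
    """Return the JSON body for a single-image embed request."""
    return {
        "model": model,
        "input": [
            {
                "type": "image_url",
                "image_url": {"url": image_data_url(image_bytes, mime_type)},
            }
        ],
    }
```

Image bytes can be read with `open(path, "rb").read()`, and the resulting dictionary serialized with `json.dumps` before posting it with the same `Content-Type: application/json` header.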

Best Practices

Model Selection

  1. Start small - Use bge-small-en-v1.5 for prototyping
  2. Scale up - Move to bge-base-en-v1.5 or bge-large-en-v1.5 in production if you need higher quality
  3. Use quantization - Enable INT8 quantization for faster inference
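
Embedding dimension also affects storage, which can inform the choice in the list above. The sketch below estimates the raw size of stored vectors using the dimensions from the Supported Models tables. It assumes vectors are kept as 32-bit floats and ignores any index overhead; the function name is hypothetical.

```python
# Dimensions are taken from the Supported Models tables in this guide.
MODEL_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "BAAI/bge-large-en-v1.5": 1024,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "openai/clip-vit-base-patch32": 512,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def raw_vector_bytes(model, num_vectors):
    """Bytes needed to hold num_vectors embeddings, excluding index overhead."""
    return MODEL_DIMENSIONS[model] * BYTES_PER_FLOAT32 * num_vectors
```

Under these assumptions, one million vectors take about 1.5 GB with bge-small-en-v1.5 (384 x 4 x 1,000,000 bytes) and about 4.1 GB with bge-large-en-v1.5.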

Batch Processing

Send multiple texts in a single request for better throughput:

{
  "model": "BAAI/bge-small-en-v1.5",
  "input": [
    "First document...",
    "Second document...",
    "Third document..."
  ]
}
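
For a large corpus, the texts can be split into fixed-size groups with one request body per group. The sketch below is one way to do that; the function names and the default batch size of 32 are illustrative assumptions, not values recommended by Termite.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def batch_payloads(model, texts, batch_size=32):
    """Build one request body per batch, preserving document order."""
    return [{"model": model, "input": chunk} for chunk in batched(texts, batch_size)]
```

Because order is preserved, the vectors returned for each batch can be matched back to the source documents by position.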

Caching

Termite automatically caches embeddings for 2 minutes. Identical requests within this window return cached results.
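
To make the behavior concrete, the sketch below models a time-limited cache keyed by the exact request payload. It only illustrates the semantics described above and is not Termite's implementation: how Termite actually keys or stores cached entries is not documented here, and the class and method names are hypothetical.

```python
import hashlib
import json
import time


class TTLCache:
    """Tiny time-based cache keyed by the exact request payload."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    @staticmethod
    def key_for(payload):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_compute(self, payload, compute):
        key = self.key_for(payload)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: reuse the stored result
        result = compute(payload)
        self._entries[key] = (now, result)
        return result
```

In this model, a payload that differs in any way (a different model, or inputs in a different order) produces a different key and is computed again.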

Integration with Antfly

Use Termite embeddings with Antfly for vector search:

indexes:
  - type: embedding
    config:
      model: BAAI/bge-small-en-v1.5
      termite_url: http://localhost:8082

Next Steps