Embedding Models
Learn how to use Termite to generate vector embeddings for semantic search, RAG pipelines, and similarity matching.
Overview
Termite supports various embedding models for generating vector representations of text and images:
- Text Embeddings - BAAI/bge, sentence-transformers, and more
- Multimodal Embeddings - CLIP models for joint text-image embeddings
- Quantized Models - Faster inference with INT8 quantization
Supported Models
Text Embedding Models
| Model | Dimensions | Use Case |
|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | General purpose, fast |
| BAAI/bge-base-en-v1.5 | 768 | Higher quality |
| BAAI/bge-large-en-v1.5 | 1024 | Best quality |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Lightweight |
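The dimension count also drives storage cost, since every dimension adds one number to each stored vector. As a rough sketch, assuming vectors are stored as 32-bit floats (4 bytes per component, which is an assumption about your vector store rather than something Termite dictates), the raw size can be estimated as follows. `MODEL_DIMS` and `index_size_bytes` are hypothetical names used only for illustration.

```python
# Dimensions copied from the text embedding model table.
MODEL_DIMS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "BAAI/bge-large-en-v1.5": 1024,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_bytes(model: str, num_vectors: int) -> int:
    """Raw vector storage in bytes, excluding index overhead and metadata."""
    return MODEL_DIMS[model] * BYTES_PER_FLOAT32 * num_vectors


if __name__ == "__main__":
    for name in MODEL_DIMS:
        mib = index_size_bytes(name, 1_000_000) / 2**20
        print(f"{name}: {mib:.0f} MiB per million vectors")
```

Under that assumption, one million bge-large vectors need roughly 4.1 GB of raw storage, compared with roughly 1.5 GB for bge-small.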
Multimodal Models
| Model | Dimensions | Use Case |
|---|---|---|
| openai/clip-vit-base-patch32 | 512 | Image-text similarity |
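Image-text similarity is usually scored as the cosine similarity between the two embeddings. The sketch below is plain Python with no Termite calls. The short vectors are made-up stand-ins for real 512-dimensional CLIP outputs, and `cosine_similarity` is an illustrative helper, not part of Termite.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    image_vec = [0.2, 0.1, 0.7]    # placeholder for an image embedding
    caption_vec = [0.2, 0.1, 0.6]  # placeholder for a text embedding
    print(cosine_similarity(image_vec, caption_vec))
```

Scores near 1 mean the two vectors point in nearly the same direction. Scores near 0 mean they are unrelated.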
Quick Start
Generate Text Embeddings
curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-small-en-v1.5",
    "input": ["Hello world", "Machine learning is amazing"]
  }'
Generate Image Embeddings
curl -X POST http://localhost:8082/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/clip-vit-base-patch32",
    "input": [
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
    ]
  }'
Best Practices
Model Selection
- Start small - Use bge-small-en-v1.5 for prototyping
- Scale up - Move to bge-base or bge-large for production
- Use quantization - Enable INT8 quantization for faster inference
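Because the model is just a field in the request body, moving from a small model to a larger one can be a one-argument change in client code. The following is a minimal sketch using only the Python standard library. It assumes the `/api/embed` endpoint and request shape shown in the Quick Start, and it returns the parsed JSON untouched because this page does not describe the response schema. `build_embed_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request

TERMITE_URL = "http://localhost:8082"  # same host and port as the Quick Start


def build_embed_request(model, inputs, base_url=TERMITE_URL):
    """Build (but do not send) a POST request for /api/embed."""
    body = json.dumps({"model": model, "input": list(inputs)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model, inputs, base_url=TERMITE_URL, timeout=30.0):
    """Send the request and return the parsed JSON response as-is."""
    request = build_embed_request(model, inputs, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Termite instance running, `embed("BAAI/bge-small-en-v1.5", ["Hello world"])` would cover prototyping, and passing `"BAAI/bge-base-en-v1.5"` instead would be the only change needed to scale up.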
Batch Processing
Send multiple texts in a single request for better throughput:
{
  "model": "BAAI/bge-small-en-v1.5",
  "input": [
    "First document...",
    "Second document...",
    "Third document..."
  ]
}
Caching
Termite automatically caches embeddings for 2 minutes. Identical requests within this window return cached results.
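To make the caching window concrete, here is a small client-side illustration of time-to-live (TTL) semantics. It is not Termite's implementation, whose cache key and eviction details are not described here. The `TTLCache` class, its key format, and the injectable clock are all assumptions made for the sketch.

```python
import json
import time


class TTLCache:
    """Keep values for ttl_seconds after they are stored."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expiry_time, value)

    @staticmethod
    def make_key(model, inputs):
        # Assumption: requests are "identical" when model and inputs match exactly.
        return json.dumps({"model": model, "input": list(inputs)}, sort_keys=True)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl_seconds, value)
```

In this model, a repeated request is a hit only if it arrives less than 120 seconds after the first one was stored. After that, the entry is dropped and the embedding would be recomputed.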
Integration with Antfly
Use Termite embeddings with Antfly for vector search:
indexes:
  - type: embedding
    config:
      model: BAAI/bge-small-en-v1.5
      termite_url: http://localhost:8082
Next Steps
- API Reference - Embedding API details
- Reranking - Improve search relevance
- Models - Browse all available models