Index Management
Create and manage indexes for full-text search, vector similarity search, and multimodal content.
Index Types
Antfly supports two primary index types:
full_text_v0 - Full-Text Search Index
BM25-based full-text search using Bleve. Automatically uses schema information for field mappings.
Configuration:
{
"type": "full_text_v0",
"name": "search_idx",
"mem_only": false
}
Options:
- mem_only (optional): If true, stores index in memory only (useful for testing)
Features:
- BM25 ranking algorithm
- Field-specific queries (title:computer)
- Boolean operators (AND, OR, NOT)
- Range queries (year:>2020)
- Phrase queries ("exact phrase")
- Uses schema's x-antfly-types for field analyzers
- Respects x-antfly-include-in-all for cross-field search
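The query features listed above can be combined into one query string. The sketch below assembles such a string in Python. The helper names `field_term` and `all_of` are hypothetical and not part of Antfly, and it assumes that a field prefix can be combined with a quoted phrase.

```python
def field_term(field: str, value: str) -> str:
    """Build a field-specific clause; multi-word values become phrase queries."""
    if " " in value:
        value = f'"{value}"'
    return f"{field}:{value}"


def all_of(*clauses: str) -> str:
    """Require every clause by joining them with the AND operator."""
    return " AND ".join(clauses)


# Field query combined with a range query, using the operators listed above.
query = all_of(field_term("title", "computer"), "year:>2020")
print(query)
```

The resulting string would be passed as the query of a full-text search request.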
aknn_v0 - Vector Similarity Index
Approximate k-nearest neighbors (AKNN) index for semantic similarity search using vector embeddings.
Performance Features:
- Hardware-accelerated distance calculations using SIMD instructions
- x86/AMD: AVX2 and AVX-512 optimizations for vector operations
- ARM: NEON and SME (Scalable Matrix Extension) acceleration
- RaBitQ quantization for reduced memory footprint and faster search
- No GPU required - optimized for CPU-based deployments
Basic Configuration:
{
"type": "aknn_v0",
"name": "embedding_idx",
"dimension": 384,
"embedder": {
"provider": "ollama",
"model": "all-minilm",
"url": "http://localhost:11434"
}
}
Required Fields:
- embedder: Embedder configuration for generating vectors
Optional Fields:
- dimension: Vector dimension (inferred from embedder if not specified)
- field: Field to embed using JSONPath syntax (e.g., "description", "metadata.content", "$.user.bio")
- template: Handlebars template for custom prompt generation
- summarizer: Vision/multimodal model for processing images/audio/video
- mem_only: Store index in memory only
Embedder Providers
Antfly supports multiple embedding providers via the embedder configuration:
Ollama (Local Models)
{
"provider": "ollama",
"model": "all-minilm",
"url": "http://localhost:11434"
}
Popular models: all-minilm, nomic-embed-text, mxbai-embed-large
OpenAI
{
"provider": "openai",
"model": "text-embedding-3-small",
"api_key": "sk-..."
}
Models: text-embedding-3-small (1536d), text-embedding-3-large (3072d), text-embedding-ada-002 (1536d)
Google Gemini
{
"provider": "gemini",
"model": "text-embedding-004",
"project_id": "my-project",
"location": "us-central1",
"dimension": 768
}
Supports multimodal embeddings (text + images).
AWS Bedrock
{
"provider": "bedrock",
"model": "amazon.titan-embed-text-v1",
"region": "us-east-1"
}
Models: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-english-v3, cohere.embed-multilingual-v3
Multimodal Indexing
Vector indexes can process images, audio, and video using vision/multimodal models.
With Summarizer (Two-Stage)
Use a vision model to generate text descriptions, then embed the descriptions:
{
"type": "aknn_v0",
"dimension": 384,
"embedder": {
"provider": "ollama",
"model": "all-minilm"
},
"summarizer": {
"provider": "ollama",
"model": "llava"
}
}
How it works: Vision model (llava) generates text description → Embedder (all-minilm) creates vector
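The two-stage flow can be pictured as a simple function composition. The sketch below is only an illustration: `two_stage_embed`, `fake_summarize`, and `fake_embed` are hypothetical stand-ins, and no real model is called.

```python
from typing import Callable, List


def two_stage_embed(
    media: bytes,
    summarize: Callable[[bytes], str],
    embed: Callable[[str], List[float]],
) -> List[float]:
    """Stage 1 turns media into text; stage 2 turns that text into a vector."""
    description = summarize(media)
    return embed(description)


# Stand-ins for demonstration only; in practice these would call a vision
# model (such as llava) and a text embedder (such as all-minilm).
def fake_summarize(media: bytes) -> str:
    return f"an image of {len(media)} bytes"


def fake_embed(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" "))]


vector = two_stage_embed(b"\x89PNG", fake_summarize, fake_embed)
print(vector)
```

Because the embedder only ever sees text, the vision model and the embedder can be swapped independently.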
Supported Vision Models:
- Ollama: llava, llava-phi3, bakllava, llava-llama3
- OpenAI: gpt-4o, gpt-4-turbo, gpt-4o-mini
- Gemini: gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash
- Anthropic: claude-3-5-sonnet-20241022, claude-3-opus-20240229
- Bedrock: anthropic.claude-3-5-sonnet-20241022-v2:0
Native Multimodal (Single-Stage)
Some models support direct multimodal embeddings:
{
"type": "aknn_v0",
"dimension": 768,
"embedder": {
"provider": "gemini",
"model": "text-embedding-004",
"project_id": "my-project"
}
}
Gemini's embedding models can directly process text and images without a separate vision model.
Supported Media Formats
When using multimodal indexes, documents can reference media in multiple ways:
Images
- Base64 encoded (data URI): data:image/jpeg;base64,<base64>, data:image/png;base64,<base64>, data:image/webp;base64,<base64>
- URLs: http://, https://, s3://, file://
Audio & Video
- URLs: http://, https://, s3://, file://
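For the Base64 form, a data URI can be built from raw bytes with the Python standard library. This is a minimal sketch; the helper name `to_data_uri` is hypothetical, and the eight bytes used here are only the PNG file signature, not a complete image.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URI of the form data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG signature stands in for real image bytes read from disk.
png_signature = b"\x89PNG\r\n\x1a\n"
document = {
    "title": "Product Photo",
    "image": to_data_uri(png_signature, "image/png"),
}
print(document["image"])
```

The MIME type in the URI should match the actual encoding of the bytes.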
Example document with image:
{
"title": "Product Photo",
"image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgA..."
}
Template-Based Prompts
Customize how documents are converted to text before embedding using Handlebars templates:
{
"type": "aknn_v0",
"dimension": 384,
"field": "description",
"template": "Product: {{title}}\nCategory: {{category}}\nDescription: {{description}}",
"embedder": {
"provider": "ollama",
"model": "all-minilm"
}
}
Template System:
- Syntax: Handlebars templating (https://handlebarsjs.com/guide/)
- Caching: Templates are automatically cached with configurable TTL (default: 5 minutes)
- Context: Templates receive the full document as context
Basic Syntax:
- {{fieldName}}: Insert field value
- {{#if field}}...{{/if}}: Conditional logic
- {{#each array}}...{{/each}}: Iterate arrays
- {{#unless condition}}...{{/unless}}: Inverse conditional
- {{@index}}, {{@key}}, {{@first}}, {{@last}}: Loop variables
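To make the field-insertion step concrete, the sketch below substitutes only plain {{fieldName}} placeholders. It is a deliberately simplified stand-in rather than a Handlebars implementation: it has no block helpers, no nested paths, and no HTML escaping. The function name `render_simple` is hypothetical.

```python
import re

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_simple(template: str, doc: dict) -> str:
    """Replace each {{field}} with the document's value; missing fields render as ''."""
    return _PLACEHOLDER.sub(lambda m: str(doc.get(m.group(1), "")), template)


template = "Product: {{title}}\nCategory: {{category}}\nDescription: {{description}}"
doc = {
    "title": "Laptop",
    "category": "Electronics",
    "description": "14-inch ultrabook",
}
print(render_simple(template, doc))
```

The rendered text, not the raw document, is what would be sent to the embedder.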
Built-in Helpers:
- scrubHtml - Remove script/style tags and extract clean text from HTML
  - Removes <script> and <style> tags
  - Adds newlines after block elements (p, div, h1-h6, li, etc.)
  - Returns plain text with preserved readability
- eq - Equality comparison for conditionals
- media - GenKit dotprompt media directive for multimodal content
  Supported URL Schemes:
  - data: - Base64 encoded data URIs (e.g., data:image/jpeg;base64,...)
  - http:// / https:// - Web URLs with automatic content type detection
  - file:// - Local filesystem paths
  - s3:// - S3-compatible storage (format: s3://endpoint/bucket/key)
  Automatic Content Processing:
  - Images: Downloaded, resized (if needed), converted to data URIs
  - PDFs: Text extracted or first page rendered as image
  - HTML: Readable text extracted using Mozilla Readability
  Security Controls: Downloads are protected by content security settings:
  - Allowed host whitelist
  - Private IP blocking (prevents SSRF attacks)
  - Download size limits (default: 100MB)
  - Download timeouts (default: 30s)
  - Image dimension limits (default: 2048px, auto-resized)
- encodeToon - Encode data in TOON format (Token-Oriented Object Notation)
  TOON Format provides 30-60% token reduction compared to JSON:
  - Compact syntax using : for key-value pairs
  - Array length markers: tags[#3]: ai,search,ml
  - Tabular format for uniform data structures
  - Optimized for LLM parsing
  Options:
  - lengthMarker (bool): Add # prefix to array counts (default: true)
  - indent (int): Indentation spacing (default: 2)
  - delimiter (string): Field separator for tabular arrays (use "\t" for tabs)
  Example output:
  title: Introduction to Vector Search
  author: Jane Doe
  tags[#3]: ai,search,ml
  References:
  - TOON Specification: https://github.com/toon-format/toon
  - Go Implementation: https://github.com/alpkeskin/gotoon
Without a template, the field value is used directly, or the entire document is serialized.
Field Extraction with JSONPath
The field parameter supports JSONPath syntax for extracting values from nested document structures:
{
"type": "aknn_v0",
"field": "metadata.description",
"embedder": {"provider": "ollama", "model": "all-minilm"}
}
JSONPath Examples:
- "title" - Top-level field
- "user.bio" - Nested field (auto-prepends $. if not present)
- "$.metadata.content" - Explicit JSONPath
- "$.items[0].name" - Array indexing
The extracted value must be a string. Use template for complex transformations or combining multiple fields.
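The path handling described above can be sketched for the small subset of JSONPath used in these examples: dotted keys and numeric array indexes. This is an illustration under that assumption, not Antfly's resolver, and the names `normalize_path` and `extract_field` are hypothetical.

```python
import re
from typing import Any

# Matches either a key segment (title, user, bio) or an index segment ([0]).
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def normalize_path(path: str) -> str:
    """Prepend '$.' when the path is not already an explicit JSONPath."""
    return path if path.startswith("$") else "$." + path


def extract_field(doc: Any, path: str) -> str:
    """Walk the document along the path and require a string at the end."""
    current = doc
    for key, index in _SEGMENT.findall(normalize_path(path)[1:]):
        current = current[int(index)] if index else current[key]
    if not isinstance(current, str):
        raise TypeError(f"{path} resolved to {type(current).__name__}, expected str")
    return current


doc = {
    "title": "Intro",
    "user": {"bio": "Search engineer"},
    "items": [{"name": "first"}],
}
print(extract_field(doc, "user.bio"))
```

A path that resolves to an object or array raises an error here, mirroring the rule that the extracted value must be a string.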
Configuration Examples
Simple Text Embedding
{
"type": "aknn_v0",
"field": "content",
"embedder": {
"provider": "ollama",
"model": "all-minilm"
}
}
Note: dimension is omitted and will be inferred from the all-minilm model (384).
Multimodal with Custom Template
{
"type": "aknn_v0",
"dimension": 768,
"template": "{{title}}: {{description}}",
"embedder": {
"provider": "gemini",
"model": "text-embedding-004",
"dimension": 768
},
"summarizer": {
"provider": "gemini",
"model": "gemini-1.5-flash"
}
}
Production OpenAI Setup
{
"type": "aknn_v0",
"dimension": 1536,
"field": "body",
"embedder": {
"provider": "openai",
"model": "text-embedding-3-small",
"api_key": "${OPENAI_API_KEY}"
}
}
Best Practices
- Choose dimension wisely: Higher dimensions improve accuracy but increase storage/compute costs
- Use templates for structured data: Combine multiple fields into meaningful prompts
- Multimodal strategy: Use two-stage (summarizer + embedder) for flexibility, single-stage for simplicity
- Local vs Cloud: Ollama for development/privacy, cloud providers for production/scale
- Field specification: Specify field when embedding a single field to avoid whole-document serialization
Implementing Autocomplete Search
To implement autocomplete/search-as-you-type functionality:
1. Define schema with search_as_you_type type:
{
"document_schemas": {
"product": {
"schema": {
"type": "object",
"properties": {
"name": {
"type": "string",
"x-antfly-types": ["text", "search_as_you_type"]
}
}
}
}
}
}
2. Create full-text index:
{
"indexes": {
"search_idx": {
"type": "full_text_v0"
}
}
}
3. Query the __2gram field:
{
"full_text_search": {
"query": "name__2gram:lap"
}
}
The search_as_you_type type creates an edge n-gram indexed field (suffixed with __2gram)
that matches partial input as users type. For example, "lap" matches "laptop", "lapel", etc.
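To see why a prefix such as "lap" matches, it helps to look at what edge n-grams are: every prefix of a term, starting from some minimum length. The sketch below assumes a minimum length of 2, inferred from the __2gram suffix, and assumes lowercasing. The actual analyzer settings are not stated here, and `edge_ngrams` is a hypothetical name.

```python
from typing import List


def edge_ngrams(term: str, min_len: int = 2) -> List[str]:
    """Return every prefix of the term that is at least min_len characters long."""
    term = term.lower()
    return [term[:n] for n in range(min_len, len(term) + 1)]


# Each prefix is indexed, so a partial query can match it exactly.
print(edge_ngrams("laptop"))
```

Since "lap" is one of the indexed prefixes of both "laptop" and "lapel", a query for name__2gram:lap can match either document.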
Reindexing Data
The sections below describe what to expect when data is reindexed, either after adding a new index or after changing an index's configuration.
Adding a New Index to Existing Table
Antfly handles reindexing automatically through async enrichment:
1. Add the index:
POST /tables/{tableName}/indexes/{indexName}
{
"type": "aknn_v0",
"embedder": {...}
}
2. Enrichment happens automatically:
- Existing documents are enriched in the background
- New writes are enriched immediately
- Leader-only processing prevents duplicate work
- Check index status to monitor progress
3. Monitor progress:
GET /tables/{tableName}/indexes/{indexName}
Look at shard_status for enrichment progress per shard.
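Monitoring can be automated with a small polling loop. The sketch below takes the status fetch as a callable so it can run without a server. It assumes the `status.rebuilding` and `status.error` fields from the response example on this page signal progress and failure. The function name, polling interval, and poll limit are hypothetical.

```python
import time
from typing import Callable


def wait_until_built(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> dict:
    """Poll index details until status.rebuilding is false or an error appears."""
    for _ in range(max_polls):
        info = fetch_status()
        status = info.get("status", {})
        if status.get("error"):
            raise RuntimeError(f"index reported an error: {status['error']}")
        if not status.get("rebuilding", False):
            return info
        time.sleep(poll_seconds)
    raise TimeoutError("index was still rebuilding after the final poll")


# Canned responses stand in for GET /tables/{tableName}/indexes/{indexName}.
responses = iter([
    {"status": {"rebuilding": True, "total_indexed": 10}},
    {"status": {"rebuilding": False, "total_indexed": 25}},
])
final = wait_until_built(lambda: next(responses), poll_seconds=0)
print(final["status"]["total_indexed"])
```

In real use, `fetch_status` would issue the GET request and return the parsed JSON body.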
Error Handling for Async Enrichment
When embedding generation or enrichment fails, Antfly handles errors gracefully:
Retry Behavior:
- Failed enrichments are automatically retried with exponential backoff
- Leader-only processing ensures no duplicate work during retries
- Enrichment continues for other documents even if some fail
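Retry with exponential backoff means waiting longer after each consecutive failure. The sketch below shows the general pattern. The base delay, growth factor, cap, and attempt count are illustrative assumptions, since Antfly's actual retry parameters are not given here, and both function names are hypothetical.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def backoff_delays(
    attempts: int = 5, base: float = 1.0, factor: float = 2.0, cap: float = 60.0
) -> List[float]:
    """Delays that grow geometrically up to a cap: 1s, 2s, 4s, ..."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def retry_with_backoff(
    op: Callable[[], T],
    delays: Sequence[float],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run op, sleeping for the next delay after each failure."""
    for delay in delays:
        try:
            return op()
        except Exception:
            sleep(delay)  # wait longer after each consecutive failure
    return op()  # final attempt; any exception propagates to the caller


print(backoff_delays())
```

Passing `sleep` as a parameter keeps the function testable without real waiting.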
Monitoring Failures:
- Check index status (GET /tables/{table}/indexes/{index}) for error counts
- Prometheus metrics expose enrichment failure rates
- Failed documents remain searchable via full-text index (if configured)
Common Failure Scenarios:
- Embedding API Unavailable
  - Enrichment retries automatically when service recovers
  - Documents remain indexed (without embeddings) for full-text search
  - Use sync_level: enrichments for synchronous embedding (fails write if embedding fails)
- Rate Limiting
  - Automatic backoff and retry
  - Consider using local Ollama to avoid rate limits
  - Or increase API quota with embedding provider
- Model Errors (invalid input, size limits)
  - Logged with document ID for investigation
  - Retries stop after max attempts
  - Document remains in table without enrichment
Fallback Strategy:
For critical workloads, use the failover merge strategy to fall back to full-text search
when embedding generation fails:
{
"full_text_search": {"query": "search term"},
"semantic_search": "search term",
"merge_config": {"strategy": "failover"}
}
This ensures queries always return results even if embeddings are unavailable.
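The failover idea can be expressed as a try-then-fall-back pattern. The sketch below illustrates those semantics only and is not Antfly's server-side implementation. The function name and the two search callables are hypothetical.

```python
from typing import Callable, List


def failover_search(
    query: str,
    semantic_search: Callable[[str], List[str]],
    full_text_search: Callable[[str], List[str]],
) -> List[str]:
    """Prefer semantic results; fall back to full-text if the semantic path fails."""
    try:
        return semantic_search(query)
    except Exception:
        return full_text_search(query)


# Stand-ins: the semantic path fails as if the embedder were unavailable.
def broken_semantic(query: str) -> List[str]:
    raise RuntimeError("embedder unavailable")


def full_text(query: str) -> List[str]:
    return [f"full-text hit for {query}"]


print(failover_search("search term", broken_semantic, full_text))
```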
Changing Index Configuration
To change index configuration:
- Drop the existing index: DELETE /tables/{table}/indexes/{index}
- Recreate with new config: POST /tables/{table}/indexes/{index}
- Existing data is automatically re-enriched
Note: Full-text indexes (full_text_v0) reindex immediately on creation and automatically
rebuild when schema changes are made. Vector indexes (aknn_v0) enrich asynchronously in the background.
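The drop-and-recreate sequence can be prepared as two HTTP requests. The sketch below only builds the requests with the standard library and does not send them. The base URL http://localhost:8080/api/v1, the table and index names, and the function name are hypothetical, and the Authorization header is omitted for brevity.

```python
import json
from typing import List
from urllib.request import Request


def recreate_index_requests(
    base_url: str, table: str, index: str, new_config: dict
) -> List[Request]:
    """Build the DELETE and POST requests, in the order they should be sent."""
    url = f"{base_url}/tables/{table}/indexes/{index}"
    drop = Request(url, method="DELETE")
    create = Request(
        url,
        data=json.dumps(new_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return [drop, create]


config = {
    "type": "aknn_v0",
    "field": "content",
    "embedder": {"provider": "ollama", "model": "all-minilm"},
}
requests_to_send = recreate_index_requests(
    "http://localhost:8080/api/v1", "products", "embedding_idx", config
)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Each request could then be sent with `urllib.request.urlopen`, waiting for the DELETE to succeed before issuing the POST.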
Backup Before Major Reindexing
For large datasets, backup before reindexing:
POST /tables/{tableName}/backup
{
"backup_id": "before-reindex-2025-01",
"location": "s3://mybucket/backups/products"
}
Graph Indexes and Edge TTL
Graph indexes (graph_v0) enable relationship modeling and traversal queries. Like documents, graph edges
support automatic expiration through per-index TTL configuration.
Graph Index Overview
Graph indexes store directed edges between documents, enabling queries like:
- Finding all connections from a document
- Traversing multi-hop paths
- Filtering relationships by type
Basic graph index:
{
"type": "graph_v0",
"name": "social_graph"
}
Edge TTL Configuration
Configure automatic edge expiration per graph index using ttl_duration:
{
"type": "graph_v0",
"name": "recent_interactions",
"ttl_duration": "7d"
}
How it works:
- Edges expire after the specified duration from their creation timestamp
- Background cleanup runs every 30 seconds on the Raft leader
- Each edge has a timestamp key (:t suffix) tracking when it was created
- Expired edges are filtered from traversal queries immediately
- Deletions are batched (1000 edges at a time) and go through Raft consensus
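The expiration check and batched deletion described above can be sketched as follows. This models only the logic, using an in-memory map of edge keys to creation times. Treating an edge as expired exactly at the TTL boundary is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterator, List


def is_expired(created_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """An edge expires once its age reaches the TTL."""
    return now - created_at >= ttl


def expired_batches(
    edge_timestamps: Dict[str, datetime],
    ttl: timedelta,
    now: datetime,
    batch_size: int = 1000,
) -> Iterator[List[str]]:
    """Yield expired edge keys in batches, reading only the timestamps."""
    batch: List[str] = []
    for edge_key, created_at in edge_timestamps.items():
        if is_expired(created_at, ttl, now):
            batch.append(edge_key)
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


now = datetime(2025, 1, 8, tzinfo=timezone.utc)
ttl = timedelta(days=7)
edges = {f"old-{i}": now - timedelta(days=8) for i in range(2500)}
edges.update({f"new-{i}": now - timedelta(hours=1) for i in range(10)})
batch_sizes = [len(batch) for batch in expired_batches(edges, ttl, now)]
print(batch_sizes)
```

Each yielded batch corresponds to one deletion request that would go through consensus.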
Edge TTL vs Document TTL
Document TTL (table-level):
- Configured via ttl_duration in table schema
- Applies to all documents in the table
- Uses _timestamp or custom ttl_field
Edge TTL (index-level):
- Configured via ttl_duration in graph index config
- Applies only to edges in that specific graph index
- Uses edge creation timestamp (stored in :t key)
- Different graph indexes can have different TTL durations
Duration Format
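Edge TTL uses the same duration format as document TTL: Go-style durations, plus a d (day) unit that Go's standard time.ParseDuration does not accept:
- 30s - 30 seconds
- 5m - 5 minutes
- 24h - 24 hours
- 7d - 7 days (treated as 168h)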
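The duration strings above map to time spans as sketched below. This handles only the single-unit forms shown (s, m, h, d). Compound durations such as 1h30m, which Go-style parsers generally accept, are out of scope for this illustration, and `parse_ttl` is a hypothetical name.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_ttl(value: str) -> timedelta:
    """Parse a single-unit duration such as '30s', '5m', '24h', or '7d'."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(seconds=int(amount) * _UNIT_SECONDS[unit])


# A day is treated as exactly 24 hours, so 7d equals 168h.
print(parse_ttl("7d") == parse_ttl("168h"))
```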
Use Cases
Session Graphs:
{
"type": "graph_v0",
"name": "active_sessions",
"ttl_duration": "1h"
}
Track active user sessions; edges expire 1 hour after they are created.
Recent Activity Feed:
{
"type": "graph_v0",
"name": "recent_interactions",
"ttl_duration": "7d"
}
Keep only the last week of user interactions; older edges are automatically deleted.
Temporary Recommendations:
{
"type": "graph_v0",
"name": "trending_connections",
"ttl_duration": "24h"
}
Track trending content connections; refresh daily by letting old edges expire.
Rate Limiting:
{
"type": "graph_v0",
"name": "api_calls",
"ttl_duration": "1h"
}
Track API calls per user; edges expire after 1 hour for sliding window rate limiting.
Performance Characteristics
Edge TTL cleanup uses the same optimized approach as document TTL:
Optimized Scanning:
- Only scans edge timestamp keys (:t suffix)
- No document deserialization required
- O(1) per edge - just reads a timestamp
- ~100-1000x faster than scanning full edge data
Query Filtering:
- Single key lookup to check edge expiration
- Minimal latency impact on traversal queries
- Scales to millions of edges
Cleanup Behavior:
- Runs every 30 seconds (same interval as document TTL)
- Leader-only operation
- 5-second grace period for replication
- Batched deletions (1000 edges per batch)
Monitoring
Edge TTL operations are logged separately from document TTL:
INFO Starting edge TTL cleanup job indexes_with_ttl=3 cleanup_interval=30s
INFO Cleaned up expired edges count=127 duration=89ms total_expired=5432
Metrics tracked per graph index:
- Total edges expired since leader became active
- Last cleanup duration
- Scanned vs expired edge counts
Modifying Edge TTL
Adding TTL to existing graph index:
- Currently requires dropping and recreating the index
- Edges are not preserved when index is dropped
- Plan accordingly for production deployments
Removing TTL:
- Drop and recreate index without ttl_duration
- All edges become permanent
Changing duration:
- Drop and recreate index with new ttl_duration
- Existing edges are not preserved
Limitations
- Per-index configuration: Each graph index has its own TTL setting
- Edge-level timestamps: Uses edge creation time (not document timestamps)
- No per-edge TTL: All edges in a graph index share the same TTL duration
- Index recreation required: Cannot modify TTL on existing graph index
- Clock synchronization: Requires NTP for accurate expiration across cluster nodes
Best Practices
Separate indexes for different TTLs:
{
"indexes": {
"permanent_relationships": {
"type": "graph_v0"
},
"recent_activity": {
"type": "graph_v0",
"ttl_duration": "7d"
},
"realtime_events": {
"type": "graph_v0",
"ttl_duration": "1h"
}
}
}
Use multiple graph indexes when you need different expiration policies for different types of relationships.
Monitor cleanup logs:
- Watch for high edge counts requiring cleanup
- Adjust TTL duration if cleanup becomes a bottleneck
- Consider shorter cleanup intervals for time-sensitive use cases
Test before production:
- Verify edge expiration behavior matches expectations
- Test cleanup performance with representative edge counts
- Ensure traversal queries filter expired edges correctly
List all indexes for a table
GET /tables/{tableName}/indexes
Security
Provide your bearer token in the Authorization header when making requests to protected resources.
Example: Authorization: Bearer YOUR_API_KEY
Code Examples
curl -X GET "/api/v1/tables/{tableName}/indexes" \
  -H "Authorization: Bearer YOUR_API_KEY"

const response = await fetch('/api/v1/tables/{tableName}/indexes', {
  method: 'GET',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});
const data = await response.json();

fetch('/api/v1/tables/{tableName}/indexes', {
  method: 'GET',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
})
  .then(response => response.json())
  .then(data => console.log(data));

import requests

headers = {
    'Authorization': 'Bearer YOUR_API_KEY'
}
response = requests.get('/api/v1/tables/{tableName}/indexes', headers=headers)
data = response.json()

package main

import (
	"log"
	"net/http"
)

func main() {
	// Replace the placeholder path with your server's full URL.
	req, err := http.NewRequest("GET", "/api/v1/tables/{tableName}/indexes", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}

Responses
[
{
"shard_status": {},
"config": {
"mem_only": true
},
"status": {
"error": "string",
"total_indexed": 0,
"disk_usage": 0,
"rebuilding": true
}
}
]
Add an index to a table
POST /tables/{tableName}/indexes/{indexName}
Security
Provide your bearer token in the Authorization header when making requests to protected resources.
Example: Authorization: Bearer YOUR_API_KEY
Request Body
Example:
{
"mem_only": true
}
Code Examples
curl -X POST "/api/v1/tables/{tableName}/indexes/{indexName}" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mem_only": true
  }'

const response = await fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "mem_only": true
  })
});
const data = await response.json();

fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "mem_only": true
  })
})
  .then(response => response.json())
  .then(data => console.log(data));

import requests

headers = {
    'Authorization': 'Bearer YOUR_API_KEY'
}
# json= serializes the body and sets the Content-Type header.
response = requests.post(
    '/api/v1/tables/{tableName}/indexes/{indexName}',
    headers=headers,
    json={
        "mem_only": True
    }
)
data = response.json()

package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	body := []byte(`{
  "mem_only": true
}`)
	// Replace the placeholder path with your server's full URL.
	req, err := http.NewRequest("POST", "/api/v1/tables/{tableName}/indexes/{indexName}", bytes.NewBuffer(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}

Responses
Drop an index from a table
DELETE /tables/{tableName}/indexes/{indexName}
Security
Provide your bearer token in the Authorization header when making requests to protected resources.
Example: Authorization: Bearer YOUR_API_KEY
Code Examples
curl -X DELETE "/api/v1/tables/{tableName}/indexes/{indexName}" \
  -H "Authorization: Bearer YOUR_API_KEY"

const response = await fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'DELETE',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});
const data = await response.json();

fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'DELETE',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
})
  .then(response => response.json())
  .then(data => console.log(data));

import requests

headers = {
    'Authorization': 'Bearer YOUR_API_KEY'
}
response = requests.delete('/api/v1/tables/{tableName}/indexes/{indexName}', headers=headers)
data = response.json()

package main

import (
	"log"
	"net/http"
)

func main() {
	// Replace the placeholder path with your server's full URL.
	req, err := http.NewRequest("DELETE", "/api/v1/tables/{tableName}/indexes/{indexName}", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}

Responses
Get index details
GET /tables/{tableName}/indexes/{indexName}
Security
Provide your bearer token in the Authorization header when making requests to protected resources.
Example: Authorization: Bearer YOUR_API_KEY
Code Examples
curl -X GET "/api/v1/tables/{tableName}/indexes/{indexName}" \
  -H "Authorization: Bearer YOUR_API_KEY"

const response = await fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'GET',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});
const data = await response.json();

fetch('/api/v1/tables/{tableName}/indexes/{indexName}', {
  method: 'GET',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
})
  .then(response => response.json())
  .then(data => console.log(data));

import requests

headers = {
    'Authorization': 'Bearer YOUR_API_KEY'
}
response = requests.get('/api/v1/tables/{tableName}/indexes/{indexName}', headers=headers)
data = response.json()

package main

import (
	"log"
	"net/http"
)

func main() {
	// Replace the placeholder path with your server's full URL.
	req, err := http.NewRequest("GET", "/api/v1/tables/{tableName}/indexes/{indexName}", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}

Responses
{
"shard_status": {},
"config": {
"mem_only": true
},
"status": {
"error": "string",
"total_indexed": 0,
"disk_usage": 0,
"rebuilding": true
}
}