Index Management#

Create and manage indexes for full-text search, vector similarity search, and multimodal content.

Index Types#

Antfly supports two primary index types:

full_text_v0 - Full-Text Search Index#

BM25-based full-text search using Bleve. Automatically uses schema information for field mappings.

Configuration:

{
  "type": "full_text_v0",
  "name": "search_idx",
  "mem_only": false
}

Options:

  • mem_only (optional): If true, stores index in memory only (useful for testing)

Features:

  • BM25 ranking algorithm
  • Field-specific queries (title:computer)
  • Boolean operators (AND, OR, NOT)
  • Range queries (year:>2020)
  • Phrase queries ("exact phrase")
  • Uses schema's x-antfly-types for field analyzers
  • Respects x-antfly-include-in-all for cross-field search

aknn_v0 - Vector Similarity Index#

Approximate k-nearest neighbors (AKNN) index for semantic similarity search using vector embeddings.

Performance Features:

  • Hardware-accelerated distance calculations using SIMD instructions
  • x86/AMD: AVX2 and AVX-512 optimizations for vector operations
  • ARM: NEON and SME (Scalable Matrix Extension) acceleration
  • RaBitQ quantization for reduced memory footprint and faster search
  • No GPU required - optimized for CPU-based deployments

Basic Configuration:

{
  "type": "aknn_v0",
  "name": "embedding_idx",
  "dimension": 384,
  "embedder": {
    "provider": "ollama",
    "model": "all-minilm",
    "url": "http://localhost:11434"
  }
}

Required Fields:

  • embedder: Embedder configuration for generating vectors

Optional Fields:

  • dimension: Vector dimension (inferred from embedder if not specified)
  • field: Field to embed using JSONPath syntax (e.g., "description", "metadata.content", "$.user.bio")
  • template: Handlebars template for custom prompt generation
  • summarizer: Vision/multimodal model for processing images/audio/video
  • mem_only: Store index in memory only
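
The required and optional fields above can be assembled programmatically. The following is a minimal sketch in Python; the helper name build_aknn_index is hypothetical and not part of any Antfly client library. It only checks that an embedder is supplied and leaves unset options out of the payload.

```python
import json

def build_aknn_index(name, embedder, dimension=None, field=None,
                     template=None, summarizer=None, mem_only=None):
    """Assemble an aknn_v0 index config dict (hypothetical helper)."""
    if not embedder:
        raise ValueError("embedder is required for aknn_v0 indexes")
    config = {"type": "aknn_v0", "name": name, "embedder": embedder}
    optional = {
        "dimension": dimension,
        "field": field,
        "template": template,
        "summarizer": summarizer,
        "mem_only": mem_only,
    }
    # Leave unset options out of the payload entirely.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config

config = build_aknn_index(
    "embedding_idx",
    {"provider": "ollama", "model": "all-minilm", "url": "http://localhost:11434"},
    field="description",
)
print(json.dumps(config, indent=2))
```

Because dimension is not passed here, the resulting config omits it and relies on inference from the embedder, as described above.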

Embedder Providers#

Antfly supports multiple embedding providers via the embedder configuration:

Ollama (Local Models)#

{
  "provider": "ollama",
  "model": "all-minilm",
  "url": "http://localhost:11434"
}

Popular models: all-minilm, nomic-embed-text, mxbai-embed-large

OpenAI#

{
  "provider": "openai",
  "model": "text-embedding-3-small",
  "api_key": "sk-..."
}

Models: text-embedding-3-small (1536d), text-embedding-3-large (3072d), text-embedding-ada-002 (1536d)

Google Gemini#

{
  "provider": "gemini",
  "model": "text-embedding-004",
  "project_id": "my-project",
  "location": "us-central1",
  "dimension": 768
}

Supports multimodal embeddings (text + images).

AWS Bedrock#

{
  "provider": "bedrock",
  "model": "amazon.titan-embed-text-v1",
  "region": "us-east-1"
}

Models: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-english-v3, cohere.embed-multilingual-v3
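
When dimension is left out of an index config, it can help to know which size to expect. The sketch below is a client-side lookup limited to the dimensions this page states (384 for all-minilm and the OpenAI sizes listed above). The table and the function name resolve_dimension are illustrative assumptions, not how Antfly itself infers the value.

```python
# Dimensions as stated in this document; models without a stated size are omitted.
KNOWN_DIMENSIONS = {
    ("ollama", "all-minilm"): 384,
    ("openai", "text-embedding-3-small"): 1536,
    ("openai", "text-embedding-3-large"): 3072,
    ("openai", "text-embedding-ada-002"): 1536,
}

def resolve_dimension(index_config):
    """Return the explicit dimension if set, else the table value, else None."""
    if "dimension" in index_config:
        return index_config["dimension"]
    embedder = index_config["embedder"]
    key = (embedder.get("provider"), embedder.get("model"))
    return KNOWN_DIMENSIONS.get(key)

example = {
    "type": "aknn_v0",
    "embedder": {"provider": "openai", "model": "text-embedding-3-large"},
}
print(resolve_dimension(example))
```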

Multimodal Indexing#

Vector indexes can process images, audio, and video using vision/multimodal models.

With Summarizer (Two-Stage)#

Use a vision model to generate text descriptions, then embed the descriptions:

{
  "type": "aknn_v0",
  "dimension": 384,
  "embedder": {
    "provider": "ollama",
    "model": "all-minilm"
  },
  "summarizer": {
    "provider": "ollama",
    "model": "llava"
  }
}

How it works: Vision model (llava) generates text description → Embedder (all-minilm) creates vector

Supported Vision Models:

  • Ollama: llava, llava-phi3, bakllava, llava-llama3
  • OpenAI: gpt-4o, gpt-4-turbo, gpt-4o-mini
  • Gemini: gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash
  • Anthropic: claude-3-5-sonnet-20241022, claude-3-opus-20240229
  • Bedrock: anthropic.claude-3-5-sonnet-20241022-v2:0
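
The two-stage flow described above can be sketched as plain function composition. In this Python illustration the summarize and embed callables are stand-ins (hypothetical, no real model is called); in a real deployment Antfly performs both stages itself.

```python
from typing import Callable, List

def two_stage_embed(media_ref: str,
                    summarize: Callable[[str], str],
                    embed: Callable[[str], List[float]]) -> List[float]:
    """Stage 1: describe the media as text. Stage 2: embed that text."""
    description = summarize(media_ref)
    return embed(description)

# Stand-ins for real model calls (hypothetical, for illustration only).
def fake_summarize(media_ref: str) -> str:
    return f"a photo referenced by {media_ref}"

def fake_embed(text: str) -> List[float]:
    # Toy 3-dimensional "embedding": character, word, and vowel counts.
    vowels = sum(ch in "aeiou" for ch in text)
    return [float(len(text)), float(len(text.split())), float(vowels)]

vector = two_stage_embed("file:///tmp/cat.jpg", fake_summarize, fake_embed)
print(vector)
```

The point of the split is that the embedder only ever sees text, so any text embedding model can be paired with any vision model.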

Native Multimodal (Single-Stage)#

Some models support direct multimodal embeddings:

{
  "type": "aknn_v0",
  "dimension": 768,
  "embedder": {
    "provider": "gemini",
    "model": "text-embedding-004",
    "project_id": "my-project"
  }
}

Gemini's embedding models can directly process text and images without a separate vision model.

Supported Media Formats#

When using multimodal indexes, documents can reference media in multiple ways:

Images#

  • Base64 encoded (data URI): data:image/jpeg;base64,<base64>, data:image/png;base64,<base64>, data:image/webp;base64,<base64>
  • URLs: http://, https://, s3://, file://

Audio & Video#

  • URLs: http://, https://, s3://, file://

Example document with image:

{
  "title": "Product Photo",
  "image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgA..."
}
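
To show how the reference styles above differ, here is a small Python sketch that classifies a media reference as either a base64 data URI or a URL with one of the listed schemes. The function name classify_media is hypothetical, and the checks are a client-side approximation rather than Antfly's own validation.

```python
from urllib.parse import urlparse

SUPPORTED_URL_SCHEMES = {"http", "https", "s3", "file"}

def classify_media(ref: str):
    """Return ("data", mime_type) for a base64 data URI or ("url", scheme) for a URL."""
    if ref.startswith("data:"):
        header, sep, _payload = ref.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("expected data:<mime>;base64,<payload>")
        mime_type = header[len("data:"):-len(";base64")]
        return ("data", mime_type)
    scheme = urlparse(ref).scheme
    if scheme in SUPPORTED_URL_SCHEMES:
        return ("url", scheme)
    raise ValueError(f"unsupported media reference: {ref!r}")

print(classify_media("data:image/png;base64,AAAA"))
print(classify_media("s3://endpoint/bucket/image.png"))
```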

Template-Based Prompts#

Customize how documents are converted to text before embedding using Handlebars templates:

{
  "type": "aknn_v0",
  "dimension": 384,
  "field": "description",
  "template": "Product: {{title}}\nCategory: {{category}}\nDescription: {{description}}",
  "embedder": {
    "provider": "ollama",
    "model": "all-minilm"
  }
}

Template System:

  • Syntax: Handlebars templating (https://handlebarsjs.com/guide/)
  • Caching: Templates are automatically cached with configurable TTL (default: 5 minutes)
  • Context: Templates receive the full document as context

Basic Syntax:

  • {{fieldName}}: Insert field value
  • {{#if field}}...{{/if}}: Conditional logic
  • {{#each array}}...{{/each}}: Iterate arrays
  • {{#unless condition}}...{{/unless}}: Inverse conditional
  • {{@index}}, {{@key}}, {{@first}}, {{@last}}: Loop variables
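
To make the effect of plain placeholders concrete, the Python sketch below renders the product template from the example above against a sample document. It is a deliberately tiny approximation that substitutes {{fieldName}} only; it is not a Handlebars implementation (no helpers, block expressions, nested paths, or HTML escaping), and the sample document is invented for illustration.

```python
import re

def render_simple(template: str, document: dict) -> str:
    """Replace {{name}} placeholders with document values.

    Handles plain placeholders only; missing fields render as empty strings.
    """
    def lookup(match):
        return str(document.get(match.group(1), ""))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lookup, template)

doc = {
    "title": "Trail Shoe",
    "category": "Footwear",
    "description": "Lightweight and grippy.",
}
template = "Product: {{title}}\nCategory: {{category}}\nDescription: {{description}}"
rendered = render_simple(template, doc)
print(rendered)
```

The rendered string, rather than the raw document, is what would be sent to the embedder.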

Built-in Helpers:

  1. scrubHtml - Remove script/style tags and extract clean text from HTML

    {{scrubHtml html_content}}
    • Removes <script> and <style> tags
    • Adds newlines after block elements (p, div, h1-h6, li, etc.)
    • Returns plain text with preserved readability
  2. eq - Equality comparison for conditionals

    {{#if (eq status "active")}}Active{{/if}}
    {{#if (eq @key "special")}}Special field{{/if}}
  3. media - GenKit dotprompt media directive for multimodal content

    {{media url=imageDataURI}}
    {{media url=this.image_url}}
    {{media url="https://example.com/image.jpg"}}
    {{media url="s3://endpoint/bucket/image.png"}}
    {{media url="file:///path/to/image.jpg"}}

    Supported URL Schemes:

    • data: - Base64 encoded data URIs (e.g., data:image/jpeg;base64,...)
    • http:// / https:// - Web URLs with automatic content type detection
    • file:// - Local filesystem paths
    • s3:// - S3-compatible storage (format: s3://endpoint/bucket/key)

    Automatic Content Processing:

    • Images: Downloaded, resized (if needed), converted to data URIs
    • PDFs: Text extracted or first page rendered as image
    • HTML: Readable text extracted using Mozilla Readability

    Security Controls: Downloads are protected by content security settings:

    • Allowed host whitelist
    • Private IP blocking (prevents SSRF attacks)
    • Download size limits (default: 100MB)
    • Download timeouts (default: 30s)
    • Image dimension limits (default: 2048px, auto-resized)

    See: https://antfly.io/docs/configuration#security--cors

  4. encodeToon - Encode data in TOON format (Token-Oriented Object Notation)

    {{encodeToon this.fields}}
    {{encodeToon this.fields lengthMarker=false indent=4}}
    {{encodeToon this.fields delimiter="\t"}}

    TOON Format provides 30-60% token reduction compared to JSON:

    • Compact syntax using : for key-value pairs
    • Array length markers: tags[#3]: ai,search,ml
    • Tabular format for uniform data structures
    • Optimized for LLM parsing

    Options:

    • lengthMarker (bool): Add # prefix to array counts (default: true)
    • indent (int): Indentation spacing (default: 2)
    • delimiter (string): Field separator for tabular arrays (use "\t" for tabs)

    Example output:

    title: Introduction to Vector Search
    author: Jane Doe
    tags[#3]: ai,search,ml

Without a template, the value of field is embedded directly; if field is not set either, the entire document is serialized.
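
The TOON example output shown for the encodeToon helper above can be reproduced for flat data with a few lines of Python. This is a simplified sketch, assuming only scalar values and lists of scalars; it does not implement nested objects, tabular arrays, indentation, or quoting, and encode_toon_flat is a hypothetical name rather than the helper's real implementation.

```python
def encode_toon_flat(data: dict, length_marker: bool = True) -> str:
    """Encode a flat dict of scalars and lists of scalars in a TOON-like form."""
    lines = []
    for key, value in data.items():
        if isinstance(value, list):
            # Arrays carry their length in brackets, optionally prefixed with "#".
            marker = "#" if length_marker else ""
            items = ",".join(str(item) for item in value)
            lines.append(f"{key}[{marker}{len(value)}]: {items}")
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)

fields = {
    "title": "Introduction to Vector Search",
    "author": "Jane Doe",
    "tags": ["ai", "search", "ml"],
}
print(encode_toon_flat(fields))
```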

Field Extraction with JSONPath#

The field parameter supports JSONPath syntax for extracting values from nested document structures:

{
  "type": "aknn_v0",
  "field": "metadata.description",
  "embedder": {"provider": "ollama", "model": "all-minilm"}
}

JSONPath Examples:

  • "title" - Top-level field
  • "user.bio" - Nested field (auto-prepends $. if not present)
  • "$.metadata.content" - Explicit JSONPath
  • "$.items[0].name" - Array indexing

The extracted value must be a string. Use template for complex transformations or combining multiple fields.
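
The path forms listed above can be illustrated with a small resolver. The Python sketch below supports only dotted names and [n] array indexes, prepends "$." when it is missing, and rejects non-string results. It is an approximation of the behaviour described here, not a full JSONPath engine, and extract_field is a hypothetical name.

```python
import re

# One path step: either ".name" or "[index]".
_STEP = re.compile(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]")

def extract_field(document, path: str) -> str:
    """Resolve a dotted path with optional [n] indexes against a document."""
    if not path.startswith("$"):
        path = "$." + path
    rest = path[1:]  # drop the leading "$"
    value = document
    pos = 0
    while pos < len(rest):
        match = _STEP.match(rest, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at {rest[pos:]!r}")
        name, index = match.group(1), match.group(2)
        value = value[name] if name is not None else value[int(index)]
        pos = match.end()
    if not isinstance(value, str):
        raise TypeError("extracted value must be a string")
    return value

doc = {"title": "T", "user": {"bio": "hi"}, "items": [{"name": "first"}]}
print(extract_field(doc, "user.bio"))
print(extract_field(doc, "$.items[0].name"))
```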

Configuration Examples#

Simple Text Embedding#

{
  "type": "aknn_v0",
  "field": "content",
  "embedder": {
    "provider": "ollama",
    "model": "all-minilm"
  }
}

Note: dimension is omitted and will be inferred from the all-minilm model (384).

Multimodal with Custom Template#

{
  "type": "aknn_v0",
  "dimension": 768,
  "template": "{{title}}: {{description}}",
  "embedder": {
    "provider": "gemini",
    "model": "text-embedding-004",
    "dimension": 768
  },
  "summarizer": {
    "provider": "gemini",
    "model": "gemini-1.5-flash"
  }
}

Production OpenAI Setup#

{
  "type": "aknn_v0",
  "dimension": 1536,
  "field": "body",
  "embedder": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "${OPENAI_API_KEY}"
  }
}

Best Practices#

  • Choose dimension wisely: Higher dimensions improve accuracy but increase storage/compute costs
  • Use templates for structured data: Combine multiple fields into meaningful prompts
  • Multimodal strategy: Use two-stage (summarizer + embedder) for flexibility, single-stage for simplicity
  • Local vs Cloud: Ollama for development/privacy, cloud providers for production/scale
  • Field specification: Specify field when embedding a single field to avoid whole-document serialization

To implement autocomplete/search-as-you-type functionality:

1. Define schema with search_as_you_type type:

{
  "document_schemas": {
    "product": {
      "schema": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "x-antfly-types": ["text", "search_as_you_type"]
          }
        }
      }
    }
  }
}

2. Create full-text index:

{
  "indexes": {
    "search_idx": {
      "type": "full_text_v0"
    }
  }
}

3. Query the __2gram field:

{
  "full_text_search": {
    "query": "name__2gram:lap"
  }
}

The search_as_you_type type creates an edge n-gram indexed field (suffixed with __2gram) that matches partial input as users type. For example, "lap" matches "laptop", "lapel", etc.
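
Why "lap" matches "laptop" is easiest to see by listing the prefixes an edge n-gram field would store. The Python sketch below assumes a minimum gram length of 2 (inferred from the __2gram suffix, not confirmed by this page) and lowercases terms; the actual analyzer settings may differ.

```python
def edge_ngrams(term: str, min_len: int = 2):
    """Return the prefixes of term from min_len characters up to the full term."""
    term = term.lower()
    return [term[:n] for n in range(min_len, len(term) + 1)]

def matches_prefix(typed: str, indexed_term: str) -> bool:
    """True if the typed text equals one of the indexed prefixes."""
    return typed.lower() in edge_ngrams(indexed_term)

print(edge_ngrams("laptop"))
print(matches_prefix("lap", "lapel"))
```

Only prefixes are stored, so typing "top" would not match "laptop" under this scheme.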

Reindexing Data#

Antfly reindexes through async enrichment, rebuilding indexes automatically in the background. The sections below describe what to expect after you add a new index or change an index's configuration.

Adding a New Index to Existing Table#

Adding an index to a table that already contains documents works as follows:

1. Add the index:

POST /tables/{tableName}/indexes/{indexName}
{
  "type": "aknn_v0",
  "embedder": {...}
}

2. Enrichment happens automatically:

  • Existing documents are enriched in the background
  • New writes are enriched immediately
  • Leader-only processing prevents duplicate work
  • Check index status to monitor progress

3. Monitor progress:

GET /tables/{tableName}/indexes/{indexName}

Look at shard_status for enrichment progress per shard.
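
Monitoring can be automated with a simple polling loop. The Python sketch below assumes that the rebuilding flag in the index details response (see the response example in the API reference on this page) turns false once background work has finished; that interpretation is an assumption, as are the function name and the polling defaults. The HTTP call is injected as a callable so the loop itself stays independent of any client library.

```python
import time

def wait_until_built(fetch_index, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_index() until status.rebuilding is false; return the last response.

    fetch_index is any callable returning the parsed JSON of the
    GET /tables/{tableName}/indexes/{indexName} response.
    """
    for _ in range(max_polls):
        details = fetch_index()
        if not details.get("status", {}).get("rebuilding", False):
            return details
        sleep(poll_seconds)
    raise TimeoutError("index still rebuilding after max_polls checks")

# Simulated responses standing in for real HTTP calls.
responses = iter([
    {"status": {"rebuilding": True, "total_indexed": 3}},
    {"status": {"rebuilding": False, "total_indexed": 10}},
])
final = wait_until_built(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"]["total_indexed"])
```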

Error Handling for Async Enrichment#

When embedding generation or enrichment fails, Antfly handles errors gracefully:

Retry Behavior:

  • Failed enrichments are automatically retried with exponential backoff
  • Leader-only processing ensures no duplicate work during retries
  • Enrichment continues for other documents even if some fail
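
Exponential backoff means each retry waits longer than the previous one, up to a cap. The Python sketch below shows the general technique; the base delay, factor, cap, and attempt count are hypothetical values chosen for illustration, since this page does not state the ones Antfly uses.

```python
def backoff_delays(base_seconds=1.0, factor=2.0, max_seconds=60.0, max_attempts=6):
    """Return the wait before each retry: base, base*factor, ... capped at max_seconds."""
    return [min(base_seconds * factor ** attempt, max_seconds)
            for attempt in range(max_attempts)]

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```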

Monitoring Failures:

  • Check index status (GET /tables/{table}/indexes/{index}) for error counts
  • Prometheus metrics expose enrichment failure rates
  • Failed documents remain searchable via full-text index (if configured)

Common Failure Scenarios:

  1. Embedding API Unavailable

    • Enrichment retries automatically when service recovers
    • Documents remain indexed (without embeddings) for full-text search
    • Use sync_level: enrichments for synchronous embedding (fails write if embedding fails)
  2. Rate Limiting

    • Automatic backoff and retry
    • Consider using local Ollama to avoid rate limits
    • Or increase API quota with embedding provider
  3. Model Errors (invalid input, size limits)

    • Logged with document ID for investigation
    • Retries stop after max attempts
    • Document remains in table without enrichment

Fallback Strategy:

For critical workloads, use the failover merge strategy to fall back to full-text search when embedding generation fails:

{
  "full_text_search": {"query": "search term"},
  "semantic_search": "search term",
  "merge_config": {"strategy": "failover"}
}

This ensures queries always return results even if embeddings are unavailable.

Changing Index Configuration#

To change index configuration:

  1. Drop the existing index: DELETE /tables/{table}/indexes/{index}
  2. Recreate with new config: POST /tables/{table}/indexes/{index}
  3. Existing data is automatically re-enriched

Note: Full-text indexes (full_text_v0) reindex immediately on creation and automatically rebuild when schema changes are made. Vector indexes (aknn_v0) enrich asynchronously in the background.
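
The drop-and-recreate sequence above can be expressed as an ordered list of requests. The Python sketch below only builds the (method, path, body) tuples using the endpoint paths from this page; sending them, and the helper name reconfigure_requests, are left as assumptions for the caller to fill in.

```python
def reconfigure_requests(table, index, new_config):
    """Return the (method, path, body) sequence for dropping and recreating an index."""
    path = f"/tables/{table}/indexes/{index}"
    return [
        ("DELETE", path, None),       # step 1: drop the existing index
        ("POST", path, new_config),   # step 2: recreate it with the new config
    ]

steps = reconfigure_requests(
    "products",
    "embedding_idx",
    {"type": "aknn_v0", "field": "body",
     "embedder": {"provider": "ollama", "model": "all-minilm"}},
)
for method, path, body in steps:
    print(method, path)
```

Step 3, re-enrichment of existing data, needs no request because it happens automatically once the index is recreated.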

Backup Before Major Reindexing#

For large datasets, backup before reindexing:

POST /tables/{tableName}/backup
{
  "backup_id": "before-reindex-2025-01",
  "location": "s3://mybucket/backups/products"
}

Graph Indexes and Edge TTL#

Graph indexes (graph_v0) enable relationship modeling and traversal queries. Like documents, graph edges support automatic expiration through per-index TTL configuration.

Graph Index Overview#

Graph indexes store directed edges between documents, enabling queries like:

  • Finding all connections from a document
  • Traversing multi-hop paths
  • Filtering relationships by type

Basic graph index:

{
  "type": "graph_v0",
  "name": "social_graph"
}

Edge TTL Configuration#

Configure automatic edge expiration per graph index using ttl_duration:

{
  "type": "graph_v0",
  "name": "recent_interactions",
  "ttl_duration": "7d"
}

How it works:

  • Edges expire after the specified duration from their creation timestamp
  • Background cleanup runs every 30 seconds on the Raft leader
  • Each edge has a timestamp key (:t suffix) tracking when it was created
  • Expired edges are filtered from traversal queries immediately
  • Deletions are batched (1000 edges at a time) and go through Raft consensus
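
The expiration rule above comes down to comparing an edge's age with the index TTL. The Python sketch below illustrates that check and the query-time filtering of expired edges; the data layout (pairs of edge id and creation time) and the function names are assumptions made for the example, not Antfly's storage format.

```python
from datetime import datetime, timedelta, timezone

def is_expired(created_at, ttl, now):
    """An edge is expired once its age reaches the TTL."""
    return now - created_at >= ttl

def live_edges(edges, ttl, now):
    """Keep only (edge_id, created_at) pairs that have not expired yet."""
    return [(edge_id, created) for edge_id, created in edges
            if not is_expired(created, ttl, now)]

now = datetime(2025, 1, 8, tzinfo=timezone.utc)
edges = [
    ("a->b", now - timedelta(days=8)),  # older than 7 days: filtered out
    ("a->c", now - timedelta(days=1)),  # still live
]
print([edge_id for edge_id, _ in live_edges(edges, timedelta(days=7), now)])
```

Filtering at query time is what lets expired edges disappear from results immediately, even before the background cleanup has deleted them.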

Edge TTL vs Document TTL#

Document TTL (table-level):

  • Configured via ttl_duration in table schema
  • Applies to all documents in the table
  • Uses _timestamp or custom ttl_field

Edge TTL (index-level):

  • Configured via ttl_duration in graph index config
  • Applies only to edges in that specific graph index
  • Uses edge creation timestamp (stored in :t key)
  • Different graph indexes can have different TTL durations

Duration Format#

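Edge TTL uses the same duration format as document TTL: Go-style duration strings, plus a d suffix for days (which Go's own time.ParseDuration does not accept):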

  • 30s - 30 seconds
  • 5m - 5 minutes
  • 24h - 24 hours
  • 7d - 7 days (treated as 168h)
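
For client-side validation, the four forms above can be parsed with a short function. This Python sketch handles single-unit values only (for example "24h" or "7d"); compound or fractional Go durations such as "1h30m" are out of scope here, and parse_ttl is a hypothetical name.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_ttl(text: str) -> timedelta:
    """Parse a single-unit duration such as "30s", "5m", "24h", or "7d"."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})

print(parse_ttl("7d") == parse_ttl("168h"))  # True
```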

Use Cases#

Session Graphs:

{
  "type": "graph_v0",
  "name": "active_sessions",
  "ttl_duration": "1h"
}

Track active user sessions; each edge expires 1 hour after it is created.

Recent Activity Feed:

{
  "type": "graph_v0",
  "name": "recent_interactions",
  "ttl_duration": "7d"
}

Keep only the last week of user interactions; older edges are deleted automatically.

Temporary Recommendations:

{
  "type": "graph_v0",
  "name": "trending_connections",
  "ttl_duration": "24h"
}

Track trending content connections; refresh daily by letting old edges expire.

Rate Limiting:

{
  "type": "graph_v0",
  "name": "api_calls",
  "ttl_duration": "1h"
}

Track API calls per user; edges expire after 1 hour for sliding window rate limiting.

Performance Characteristics#

Edge TTL cleanup uses the same optimized approach as document TTL:

Optimized Scanning:

  • Only scans edge timestamp keys (:t suffix)
  • No document deserialization required
  • O(1) per edge - just reads a timestamp
  • ~100-1000x faster than scanning full edge data

Query Filtering:

  • Single key lookup to check edge expiration
  • Minimal latency impact on traversal queries
  • Scales to millions of edges

Cleanup Behavior:

  • Runs every 30 seconds (same interval as document TTL)
  • Leader-only operation
  • 5-second grace period for replication
  • Batched deletions (1000 edges per batch)
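
The cleanup behavior above can be sketched as a scan over creation timestamps that groups expired edge ids into batches. In this Python illustration the grace period is interpreted as extra time added to the TTL before an edge becomes eligible for deletion; that interpretation, the in-memory dict, and the function name are assumptions, while the batch size and grace value come from the list above.

```python
from datetime import datetime, timedelta

BATCH_SIZE = 1000
GRACE = timedelta(seconds=5)

def cleanup_batches(edge_timestamps, ttl, now, batch_size=BATCH_SIZE, grace=GRACE):
    """Yield lists of edge ids older than ttl + grace, batch_size ids at a time.

    edge_timestamps maps edge id -> creation datetime.
    """
    batch = []
    for edge_id, created_at in edge_timestamps.items():
        if now - created_at > ttl + grace:
            batch.append(edge_id)
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch

now = datetime(2025, 1, 8)
stamps = {f"edge-{i}": now - timedelta(hours=2) for i in range(2500)}
stamps["fresh"] = now - timedelta(minutes=10)
sizes = [len(batch) for batch in cleanup_batches(stamps, timedelta(hours=1), now)]
print(sizes)  # [1000, 1000, 500]
```

Only the timestamps are read during the scan, which mirrors the point made above that no edge data needs to be deserialized.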

Monitoring#

Edge TTL operations are logged separately from document TTL:

INFO  Starting edge TTL cleanup job indexes_with_ttl=3 cleanup_interval=30s
INFO  Cleaned up expired edges count=127 duration=89ms total_expired=5432

Metrics tracked per graph index:

  • Total edges expired since leader became active
  • Last cleanup duration
  • Scanned vs expired edge counts
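
The log lines shown above carry their numbers as key=value fields, which makes them easy to pull into a dashboard or alert. The Python sketch below extracts those fields from a line; it assumes the format stays as in the sample and keeps all values as strings.

```python
import re

def parse_log_fields(line: str) -> dict:
    """Extract key=value pairs from a log line; values stay strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

line = "INFO  Cleaned up expired edges count=127 duration=89ms total_expired=5432"
fields = parse_log_fields(line)
print(fields)
```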

Modifying Edge TTL#

Adding TTL to existing graph index:

  • Currently requires dropping and recreating the index
  • Edges are not preserved when index is dropped
  • Plan accordingly for production deployments

Removing TTL:

  • Drop and recreate index without ttl_duration
  • All edges become permanent

Changing duration:

  • Drop and recreate index with new ttl_duration
  • Existing edges are not preserved

Limitations#

  • Per-index configuration: Each graph index has its own TTL setting
  • Edge-level timestamps: Uses edge creation time (not document timestamps)
  • No per-edge TTL: All edges in a graph index share the same TTL duration
  • Index recreation required: Cannot modify TTL on existing graph index
  • Clock synchronization: Requires NTP for accurate expiration across cluster nodes

Best Practices#

Separate indexes for different TTLs:

{
  "indexes": {
    "permanent_relationships": {
      "type": "graph_v0"
    },
    "recent_activity": {
      "type": "graph_v0",
      "ttl_duration": "7d"
    },
    "realtime_events": {
      "type": "graph_v0",
      "ttl_duration": "1h"
    }
  }
}

Use multiple graph indexes when you need different expiration policies for different types of relationships.

Monitor cleanup logs:

  • Watch for high edge counts requiring cleanup
  • Adjust TTL duration if cleanup becomes a bottleneck
  • Consider shorter cleanup intervals for time-sensitive use cases

Test before production:

  • Verify edge expiration behavior matches expectations
  • Test cleanup performance with representative edge counts
  • Ensure traversal queries filter expired edges correctly

List all indexes for a table#

GET /tables/{tableName}/indexes

Security#

Provide your bearer token in the Authorization header when making requests to protected resources.

Example: Authorization: Bearer YOUR_API_KEY

Code Examples#

curl -X GET "/api/v1/tables/{tableName}/indexes" \
    -H "Authorization: Bearer YOUR_API_KEY"
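
For readers calling the API from Python, the same request can be built with the standard library. The sketch below constructs the request without sending it; the base URL http://localhost:8080 and the table name are placeholders chosen for illustration, not documented defaults.

```python
import urllib.request

def list_indexes_request(base_url: str, table: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a table's indexes."""
    url = f"{base_url.rstrip('/')}/api/v1/tables/{table}/indexes"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )

request = list_indexes_request("http://localhost:8080", "products", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
# To send it: pass the request to urllib.request.urlopen() and parse the JSON body.
```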

Responses#

[
  {
    "shard_status": {},
    "config": {
      "mem_only": true
    },
    "status": {
      "error": "string",
      "total_indexed": 0,
      "disk_usage": 0,
      "rebuilding": true
    }
  }
]

Add an index to a table#

POST /tables/{tableName}/indexes/{indexName}

Security#

Provide your bearer token in the Authorization header when making requests to protected resources.

Example: Authorization: Bearer YOUR_API_KEY

Request Body#

Example:

{
    "mem_only": true
}

Code Examples#

curl -X POST "/api/v1/tables/{tableName}/indexes/{indexName}" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
    "mem_only": true
}'

Responses#

No response body

Drop an index from a table#

DELETE /tables/{tableName}/indexes/{indexName}

Security#

Provide your bearer token in the Authorization header when making requests to protected resources.

Example: Authorization: Bearer YOUR_API_KEY

Code Examples#

curl -X DELETE "/api/v1/tables/{tableName}/indexes/{indexName}" \
    -H "Authorization: Bearer YOUR_API_KEY"

Responses#

No response body

Get index details#

GET /tables/{tableName}/indexes/{indexName}

Security#

Provide your bearer token in the Authorization header when making requests to protected resources.

Example: Authorization: Bearer YOUR_API_KEY

Code Examples#

curl -X GET "/api/v1/tables/{tableName}/indexes/{indexName}" \
    -H "Authorization: Bearer YOUR_API_KEY"

Responses#

{
  "shard_status": {},
  "config": {
    "mem_only": true
  },
  "status": {
    "error": "string",
    "total_indexed": 0,
    "disk_usage": 0,
    "rebuilding": true
  }
}