Documentation - Antfly Documentation

What is AntflyDB?#

AntflyDB is a distributed, horizontally scalable document database built from the ground up for the AI era. It combines the reliability of proven distributed systems technology with cutting-edge vector search capabilities, making it the ideal choice for modern applications that need both traditional database features and AI-powered semantic search.

Core values#

We're here to help AI engineers experiment faster. From local development to deployment, Antfly is meant to scale with your needs. We're platform engineers with decades of experience maintaining databases, so we know operational simplicity matters as much to experimentation as local development does. Antfly therefore treats telemetry, autoscaling, and security as core to every feature. We built Antfly from the ground up as a Kubernetes-native deployment that you can run on your laptop or in your cloud. For bigger teams, we imagine you'll be doing both!

We prioritize telemetry, scaling, and security for AI engineers too. AI engineers need to monitor the efficacy of the models they use and how those models perform in real-world use cases. We believe these systems will need to scale along several dimensions: cost, size, latency, and access patterns.

Key Features#

Hybrid Search#

BM25 + Vector Similarity: Full-text search with vector embeddings using Reciprocal Rank Fusion (RRF)
Multimodal: Index and search text, images, audio, and video
Embedding Providers: Ollama, OpenAI, AWS Bedrock, Google Gemini, Anthropic
Hardware Acceleration: SIMD optimizations (AVX2/AVX-512 on x86, NEON on ARM) and SME (Scalable Matrix Extension) for ARM
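
To make the BM25 + vector fusion above concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It illustrates the general RRF technique, not Antfly's internal implementation: the function name, the sample document IDs, and the constant k=60 (a value commonly used for RRF) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant;
    the value Antfly uses is not stated in this document.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical full-text ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical vector ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Because RRF uses only rank positions, it can combine BM25 scores and vector distances without normalizing them onto a shared scale. Documents that rank well in both lists (doc_a and doc_c here) rise above documents found by only one retriever.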

Distributed Consensus#

Multi-Raft Design: Separate consensus groups for metadata (single cluster) and storage (one per shard)
Horizontal Sharding: Data automatically partitioned across shards with configurable replication
Distributed Transactions: Atomic cross-shard writes using coordinator-based 2PC with intent-based locking
Fine-Grained Durability: Control write consistency with sync levels (propose, write, full_text, enrichments, aknn)
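
The idea behind coordinator-based 2PC with write intents can be sketched with a toy, in-memory model. Everything below is hypothetical (the `Shard` class, `write_transaction`, and the transaction IDs), and it deliberately omits what a real system needs: Raft replication, durability, timeouts, and coordinator failure recovery.

```python
class Shard:
    """Toy in-memory shard: committed data plus per-key write intents."""

    def __init__(self):
        self.data = {}
        self.intents = {}  # key -> (txn_id, pending value)

    def prepare(self, txn_id, writes):
        # Refuse if any key already carries another transaction's intent.
        if any(k in self.intents and self.intents[k][0] != txn_id for k in writes):
            return False
        for key, value in writes.items():
            self.intents[key] = (txn_id, value)
        return True

    def commit(self, txn_id):
        # Promote this transaction's intents to committed values.
        for key in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            self.data[key] = self.intents.pop(key)[1]

    def abort(self, txn_id):
        # Drop this transaction's intents without touching committed data.
        for key in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            del self.intents[key]


def write_transaction(txn_id, writes_by_shard):
    """Coordinator: phase 1 prepares every shard, phase 2 commits or aborts all."""
    prepared = []
    for shard, writes in writes_by_shard:
        if not shard.prepare(txn_id, writes):
            for done in prepared:
                done.abort(txn_id)
            return False
        prepared.append(shard)
    for shard in prepared:
        shard.commit(txn_id)
    return True


s1, s2 = Shard(), Shard()
s2.intents["y"] = ("other-txn", 0)  # simulate a conflicting in-flight write
ok_conflict = write_transaction("t1", [(s1, {"x": 1}), (s2, {"y": 2})])
s2.abort("other-txn")
ok_retry = write_transaction("t2", [(s1, {"x": 1}), (s2, {"y": 2})])
```

In this sketch the first attempt fails because one shard holds a conflicting intent, so the coordinator rolls back the intent it had already placed on the other shard and nothing becomes visible. The retry succeeds on both shards, which is the all-or-nothing property the feature list describes.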

AI-Native Features#

RAG: Streaming retrieval-augmented generation with Server-Sent Events
Answer Agent: Intelligent query routing with automatic query generation
Document Chunking: Semantic chunking service (Termite) with multi-tier caching
Async Enrichment: Background embedding and summarization with leader-only processing
Multimodal Remote Processing: Automatic download, processing, and security controls for images, PDFs, and HTML from URLs (http/https/s3/file) with SSRF prevention and content validation
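
Since RAG responses are streamed as Server-Sent Events, a client needs to split the stream into events. The sketch below parses the standard SSE wire format (`event:` and `data:` fields, with a blank line ending each event). The event names `token` and `done` and the sample payloads are assumptions for illustration, not Antfly's documented stream schema.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            # The format strips a single leading space from the value.
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: event names and payloads are illustrative only.
stream = [
    "event: token\n", "data: Antfly\n", "\n",
    "event: token\n", "data:  combines\n", "\n",
    ": keep-alive\n",
    "event: done\n", "data: {}\n", "\n",
]
events = list(parse_sse(stream))
answer = "".join(d for e, d in events if e == "token")
```

In practice the lines would come from a streaming HTTP response rather than a list, and the parser would stay the same as long as it receives one line at a time.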

Developer Experience#

Client Libraries: Auto-generated SDKs for Go, TypeScript, and Python from OpenAPI spec
JSON Schema Extensions: Custom x-antfly-* annotations for indexing
MongoDB-Style Transforms: In-place updates ($inc, $set, $push) without read-modify-write
Linear Merge: Stateless import of sorted records from external sources
Template Rendering: Handlebars templates for custom document rendering
Graph Operations: Declarative queries with traversals and pathfinding
TTL Support: Automatic expiration for documents (table-level) and edges (graph index-level)
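
The semantics of the `$inc`, `$set`, and `$push` transforms listed above can be illustrated with a small local model. This is only a sketch of what the operators mean for top-level fields; in Antfly the update is applied by the database, and the request shape shown here (`ops` as a dictionary keyed by operator) is an assumption borrowed from MongoDB's conventions.

```python
def apply_transforms(doc, ops):
    """Apply $set, $inc, and $push operators to a copy of doc (top-level fields only)."""
    result = dict(doc)
    for field, value in ops.get("$set", {}).items():
        result[field] = value
    for field, delta in ops.get("$inc", {}).items():
        # A missing field is treated as 0 before incrementing.
        result[field] = result.get(field, 0) + delta
    for field, item in ops.get("$push", {}).items():
        # Copy the list so the original document is left untouched.
        result[field] = list(result.get(field, [])) + [item]
    return result


doc = {"title": "Intro", "views": 41, "tags": ["db"]}
ops = {
    "$set": {"title": "Overview"},
    "$inc": {"views": 1},
    "$push": {"tags": "search"},
}
updated = apply_transforms(doc, ops)
```

The benefit of sending operators instead of whole documents is that the client never has to read the current value first, which avoids lost updates when two writers increment the same counter concurrently.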

Operations & Scalability#

Online Shard Splitting: Automatic reallocation and splitting of shards without downtime
Kubernetes Operator: Official antfly-operator with autoscaling and lifecycle management
Prometheus Metrics: Built-in monitoring endpoints for observability
Zero-Downtime Upgrades: Rolling upgrades across Raft consensus groups

Flexible Indexing#

Index Types: full_text (BM25), embeddings (vector with RaBitQ), graph (relationship traversal)
Enrichers: summaries, embeddings, chunking (async background processing)
Query Processors: chunker (semantic document splitting), pruner (result filtering), reranker (relevance re-scoring)
Multimodal: Vision model + embedder or native multimodal embeddings (Gemini)
Pluggable Architecture: Easy to add custom index types via registry pattern

Use Cases#

AntflyDB excels in scenarios where you need:

Semantic Search Applications: Build intelligent search experiences that understand context and meaning
RAG (Retrieval-Augmented Generation): Power your LLM applications with relevant context retrieval
Knowledge Management Systems: Store and query large document collections with both keyword and conceptual search
Content Recommendation Engines: Find similar content based on semantic understanding
Multi-Modal Search: Combine traditional filters with AI-powered similarity matching
Image Understanding: Process and search images using vision models for summarization and embedding

Why AntflyDB?#

Unlike traditional databases that bolt on vector search as an afterthought, AntflyDB treats AI capabilities as first-class citizens. This means:

Embedding models are deeply integrated into the query engine
Vector indexes are as performant and reliable as traditional indexes
Hybrid queries are optimized at the core engine level
Scaling AI workloads is as simple as adding more nodes

Antfly takes the management of embeddings and indexing off your hands. You insert your data, and Antfly builds and maintains embeddings from that data. This lets you build, test, and manage models without building infrastructure for the embedding lifecycle, which covers tasks such as:

Upgrading embeddings to use a newer model
Migrating embeddings from one model to another
Experimenting with different prompt contexts for embeddings
Comparing performance across different prompts, models and dimensions

This design also lets Antfly offer further features in the future (see our roadmap), such as running semantic extraction, text summarization, or image summarization through an LLM before embedding your documents (using our prompt templates feature with Handlebars), or building custom ML models to produce your own embeddings.

Comparing AntflyDB to Alternatives#

vs. Elasticsearch#

Advantages of Antfly:

Hybrid Search Built-in: Native BM25 + vector similarity with RRF (no plugins needed)
Hardware Acceleration: SIMD (AVX2/AVX-512, NEON) and SME for fast vector operations
Distributed Transactions: Atomic cross-shard writes with 2PC (Elasticsearch lacks this)
Graph Operations: Built-in graph traversal and pathfinding
Online Shard Splitting: Automatic reallocation without downtime
Operational Simplicity: Single binary, fewer moving parts
Kubernetes Native: Official antfly-operator with autoscaling

When to use Elasticsearch instead:

Need mature ecosystem with extensive plugins
Require specific Elasticsearch features (percolator, suggester)
Team has deep Elasticsearch expertise

vs. Pinecone / Weaviate / Milvus (Vector Databases)#

Advantages of Antfly:

Hybrid Search: Combines keyword (BM25) and semantic search for better RAG accuracy
CPU-Optimized SIMD: Hardware-accelerated vector operations (AVX2/AVX-512, NEON, SME) without GPU requirement
Full-Text Search: Built-in Bleve for traditional search use cases
No Separate Services: All-in-one solution (no need for separate keyword search)
Graph Relationships: Graph index for connected data

When to use vector-only databases instead:

Pure vector similarity is sufficient (no keyword search needed)
Need GPU acceleration for massive-scale vector search (billions of vectors)

vs. pgvector (Postgres Extension)#

Advantages of Antfly:

Purpose-Built for Search: Optimized for full-text and vector search workloads
Hybrid Search: Native RRF merging of BM25 and vector results
Horizontal Scaling: Multi-Raft sharding for distributed search
RAG Optimizations: Built-in reranking, pruning, document rendering

When to use pgvector instead:

Data already in Postgres and hybrid search not critical
Need SQL joins and transactional guarantees
Simpler deployment model (single database)

Known Limitations#

Antfly is designed for high-performance search and retrieval workloads. Some features are intentionally not supported:

Not Supported:

Read-Write Transactions: Only write-only transactions are supported across shards (reads are not transactional)
Distributed Joins: Use graph queries or denormalize data instead
SQL Interface: Uses JSON API and Bleve query syntax instead
ACID for Reads: Reads are eventually consistent across shards

Use Cases Where Antfly May Not Be the Best Fit:

Traditional OLTP workloads requiring ACID transactions
Complex analytical queries with joins and aggregations
Applications requiring strong consistency for reads

Getting Started#

Ready to dive in? Check out our Quickstart Guide to get AntflyDB running in minutes, or explore the Architecture to understand how it all works under the hood.