- What is AntflyDB?
- What are the key features of Antfly?
- How does Antfly compare to Elasticsearch?
- What use cases is Antfly good for?
- Does Antfly support vector search?
What is AntflyDB?
AntflyDB is a distributed, horizontally scalable document database built from the ground up for the AI era. It combines the reliability of proven distributed systems technology with cutting-edge vector search capabilities, making it the ideal choice for modern applications that need both traditional database features and AI-powered semantic search.
Core values
We're here to enable AI engineers to experiment faster. From local development to deployment, Antfly is meant to scale with your needs. We're platform engineers with decades of experience maintaining databases, so we know that operational simplicity is just as important to experimentation as local development. Antfly therefore treats modern operations practices (telemetry, autoscaling, and security) as core to every feature. We built Antfly from the ground up as a Kubernetes-native deployment that you can run on your laptop or in your cloud. For bigger teams, we imagine you'll be doing both!
For AI engineers, we're also prioritizing telemetry, scaling, and security. We know AI engineers need to monitor the efficacy of the models they use and how those models perform in real-world use cases. We believe these systems will need to scale along many dimensions: cost, size, latency, and access patterns.
Key Features
Hybrid Search
- BM25 + Vector Similarity: Full-text search with vector embeddings using Reciprocal Rank Fusion (RRF)
- Multimodal: Index and search text, images, audio, and video
- Embedding Providers: Ollama, OpenAI, AWS Bedrock, Google Gemini, Anthropic
- Hardware Acceleration: SIMD optimizations (AVX2/AVX-512 on x86, NEON on ARM) and SME (Scalable Matrix Extension) for ARM
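To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It illustrates the general technique only: the function name, the sample document IDs, and the constant k=60 (a value commonly used in the literature) are assumptions, not Antfly's documented API or defaults.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed constant, not a
    documented Antfly default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical full-text results
vector_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical vector results
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still appears in the fused result. Because RRF uses ranks rather than raw scores, BM25 and vector similarity scores never need to be normalized against each other.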
Distributed Consensus
- Multi-Raft Design: Separate consensus groups for metadata (single cluster) and storage (one per shard)
- Horizontal Sharding: Data automatically partitioned across shards with configurable replication
- Distributed Transactions: Atomic cross-shard writes using coordinator-based 2PC with intent-based locking
- Fine-Grained Durability: Control write consistency with sync levels (propose, write, full_text, enrichments, aknn)
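The coordinator-based two-phase commit with write intents described above can be sketched as a toy, in-memory simulation. Everything here is hypothetical (the Shard class, run_transaction, and the conflict rule); a real system would persist each step through Raft and handle coordinator failure and timeouts, none of which is modeled.

```python
class Shard:
    """Toy in-memory shard: committed data plus pending write intents."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.intents = {}  # key -> (txn_id, value)

    def prepare(self, txn_id, writes):
        # Refuse if any key already carries another transaction's intent.
        if any(k in self.intents and self.intents[k][0] != txn_id for k in writes):
            return False
        for k, v in writes.items():
            self.intents[k] = (txn_id, v)
        return True

    def commit(self, txn_id):
        # Promote this transaction's intents to committed data.
        for k in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            self.data[k] = self.intents.pop(k)[1]

    def abort(self, txn_id):
        # Drop this transaction's intents without touching committed data.
        for k in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            del self.intents[k]


def run_transaction(txn_id, writes_by_shard):
    """Coordinator: phase 1 prepares every shard, phase 2 commits or aborts."""
    prepared = []
    for shard, writes in writes_by_shard.items():
        if shard.prepare(txn_id, writes):
            prepared.append(shard)
        else:
            for p in prepared:
                p.abort(txn_id)
            return False
    for shard in prepared:
        shard.commit(txn_id)
    return True


s1, s2 = Shard("s1"), Shard("s2")
ok = run_transaction("t1", {s1: {"a": 1}, s2: {"b": 2}})

# An intent left by another in-flight transaction blocks "t2" on shard s2.
s2.prepare("t-other", {"b": 99})
blocked = run_transaction("t2", {s1: {"a": 10}, s2: {"b": 20}})
```

In this sketch the second transaction aborts as a whole: its intent on s1 is rolled back, so neither shard ever exposes a partial write.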
AI-Native Features
- RAG: Streaming retrieval-augmented generation with Server-Sent Events
- Answer Agent: Intelligent query routing with automatic query generation
- Document Chunking: Semantic chunking service (Termite) with multi-tier caching
- Async Enrichment: Background embedding and summarization with leader-only processing
- Multimodal Remote Processing: Automatic download, processing, and security controls for images, PDFs, and HTML from URLs (http/https/s3/file) with SSRF prevention and content validation
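Because RAG responses are streamed as Server-Sent Events, a client has to split the stream into events. The sketch below parses the standard SSE wire format (event: and data: fields, a blank line ending each event, and lines starting with ":" as comments). The event names token and done are invented for illustration; the event names Antfly actually emits are not stated here.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            # The format allows one optional space after the colon.
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream; real event names and payloads may differ.
stream = [
    "event: token\n", "data: Hello\n", "\n",
    ": keep-alive\n",
    "event: token\n", "data: world\n", "\n",
    "event: done\n", "data: {}\n", "\n",
]
events = list(parse_sse(stream))
```

In practice the lines would come from a streaming HTTP response rather than a list, but the parsing logic stays the same.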
Developer Experience
- Client Libraries: Auto-generated SDKs for Go, TypeScript, and Python from OpenAPI spec
- JSON Schema Extensions: Custom x-antfly-* annotations for indexing
- MongoDB-Style Transforms: In-place updates ($inc, $set, $push) without read-modify-write
- Linear Merge: Stateless import of sorted records from external sources
- Template Rendering: Handlebars templates for custom document rendering
- Graph Operations: Declarative queries with traversals and pathfinding
- TTL Support: Automatic expiration for documents (table-level) and edges (graph index-level)
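As an illustration of what the MongoDB-style transform operators mean, the sketch below applies $set, $inc, and $push to a plain dictionary. It assumes the semantics match MongoDB's operators of the same names and handles only top-level fields. The function name and the request shape are hypothetical, since Antfly's actual request format is not shown here.

```python
import copy


def apply_transforms(doc, ops):
    """Return a copy of doc with $set, $inc, and $push applied (top-level fields only)."""
    out = copy.deepcopy(doc)
    for field, value in ops.get("$set", {}).items():
        out[field] = value
    for field, delta in ops.get("$inc", {}).items():
        # A missing field is treated as 0, as in MongoDB.
        out[field] = out.get(field, 0) + delta
    for field, item in ops.get("$push", {}).items():
        # A missing field becomes a new one-element list.
        out.setdefault(field, []).append(item)
    return out


doc = {"title": "Intro", "views": 41, "tags": ["db"]}
updated = apply_transforms(doc, {
    "$set": {"title": "Introduction"},
    "$inc": {"views": 1},
    "$push": {"tags": "search"},
})
```

The benefit of applying such operators server-side is that the client sends only the operation, not the whole document, so two concurrent increments cannot overwrite each other the way two read-modify-write cycles can.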
Operations & Scalability
- Online Shard Splitting: Automatic reallocation and splitting of shards without downtime
- Kubernetes Operator: Official antfly-operator with autoscaling and lifecycle management
- Prometheus Metrics: Built-in monitoring endpoints for observability
- Zero-Downtime Upgrades: Rolling upgrades across Raft consensus groups
Flexible Indexing
- Index Types: full_text_v0 (BM25), aknn_v0 (vector with RaBitQ), graph_v0 (relationship traversal)
- Enrichers: summaries, embeddings, chunking (async background processing)
- Query Processors: chunker (semantic document splitting), pruner (result filtering), reranker (relevance re-scoring)
- Multimodal: Vision model + embedder or native multimodal embeddings (Gemini)
- Pluggable Architecture: Easy to add custom index types via registry pattern
Use Cases
AntflyDB excels in scenarios where you need:
- Semantic Search Applications: Build intelligent search experiences that understand context and meaning
- RAG (Retrieval-Augmented Generation): Power your LLM applications with relevant context retrieval
- Knowledge Management Systems: Store and query large document collections with both keyword and conceptual search
- Content Recommendation Engines: Find similar content based on semantic understanding
- Multi-Modal Search: Combine traditional filters with AI-powered similarity matching
- Image Understanding: Process and search images using vision models for summarization and embedding
Why AntflyDB?
Unlike traditional databases that bolt on vector search as an afterthought, AntflyDB treats AI capabilities as first-class citizens. This means:
- Embedding models are deeply integrated into the query engine
- Vector indexes are as performant and reliable as traditional indexes
- Hybrid queries are optimized at the core engine level
- Scaling AI workloads is as simple as adding more nodes
Antfly abstracts embedding and index management away from you. You insert your data, and Antfly builds and maintains the embeddings derived from it. This lets you build, test, and manage models without building infrastructure for the embedding lifecycle, which simplifies tasks such as:
- Upgrading embeddings to use a newer model
- Migrating embeddings off of one model to another
- Experimenting with different prompt contexts for embeddings
- Comparing performance across different prompts, models and dimensions
This also allows Antfly to offer more features in the future (see our roadmap), such as LLM-based semantic extraction, text summarization, and image summarization applied before your documents are embedded (using our prompt templates feature with Handlebars), or custom ML models for building your own embeddings.
Comparing AntflyDB to Alternatives
vs. Elasticsearch
Advantages of Antfly:
- Hybrid Search Built-in: Native BM25 + vector similarity with RRF (no plugins needed)
- Hardware Acceleration: SIMD (AVX2/AVX-512, NEON) and SME for fast vector operations
- Distributed Transactions: Atomic cross-shard writes with 2PC (Elasticsearch lacks this)
- Graph Operations: Built-in graph traversal and pathfinding
- Online Shard Splitting: Automatic reallocation without downtime
- Operational Simplicity: Single binary, fewer moving parts
- Kubernetes Native: Official antfly-operator with autoscaling
When to use Elasticsearch instead:
- Need mature ecosystem with extensive plugins
- Require specific Elasticsearch features (percolator, suggester)
- Team has deep Elasticsearch expertise
vs. Pinecone / Weaviate / Milvus (Vector Databases)
Advantages of Antfly:
- Hybrid Search: Combines keyword (BM25) and semantic search for better RAG accuracy
- CPU-Optimized SIMD: Hardware-accelerated vector operations (AVX2/AVX-512, NEON, SME) without GPU requirement
- Full-Text Search: Built-in Bleve for traditional search use cases
- No Separate Services: All-in-one solution (no need for separate keyword search)
- Graph Relationships: Graph index for connected data
When to use vector-only databases instead:
- Pure vector similarity is sufficient (no keyword search needed)
- Need GPU acceleration for massive-scale vector search (billions of vectors)
vs. pgvector (Postgres Extension)
Advantages of Antfly:
- Purpose-Built for Search: Optimized for full-text and vector search workloads
- Hybrid Search: Native RRF merging of BM25 and vector results
- Horizontal Scaling: Multi-Raft sharding for distributed search
- RAG Optimizations: Built-in reranking, pruning, document rendering
When to use pgvector instead:
- Data already in Postgres and hybrid search not critical
- Need SQL joins and transactional guarantees
- Simpler deployment model (single database)
Known Limitations
Antfly is designed for high-performance search and retrieval workloads. Some features are intentionally not supported:
Not Supported:
- Read-Write Transactions: Cross-shard transactions are write-only (reads are not transactional)
- Distributed Joins: Use graph queries or denormalize data instead
- SQL Interface: Uses JSON API and Bleve query syntax instead
- ACID for Reads: Reads are eventually consistent across shards
Use Cases Where Antfly May Not Be the Best Fit:
- Traditional OLTP workloads requiring ACID transactions
- Complex analytical queries with joins and aggregations
- Applications requiring strong consistency for reads
Getting Started
Ready to dive in? Check out our Quickstart Guide to get AntflyDB running in minutes, or explore the Architecture to understand how it all works under the hood.