Antfly Public API
Antfly is a distributed key-value store and vector search engine built on a multi-Raft consensus architecture. It pairs transactional document storage with full-text and vector indexing, so a single query can combine keyword and semantic search across a sharded cluster.
What Makes Antfly Unique
Hybrid Search
- BM25 + Vector Similarity: Full-text search with vector embeddings using Reciprocal Rank Fusion (RRF)
- Multimodal: Index and search text, images, audio, and video
- Embedding Providers: Ollama, OpenAI, AWS Bedrock, Google Gemini, Anthropic
- Hardware Acceleration: SIMD optimizations (AVX2/AVX-512 on x86, NEON on ARM) and SME (Scalable Matrix Extension) for ARM
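To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It is not Antfly's implementation: the function name, the document IDs, and the constant k=60 (the value commonly used in the RRF literature) are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed default, not Antfly's setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a BM25 query and a vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF uses only rank positions, it needs no normalization between BM25 scores and vector distances, which is the usual reason for choosing it in hybrid search.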
Distributed Consensus
- Multi-Raft Design: Separate consensus groups for metadata (single cluster) and storage (one per shard)
- Horizontal Sharding: Data automatically partitioned across shards with configurable replication
- Distributed Transactions: Atomic cross-shard writes using coordinator-based 2PC with intent-based locking
- Fine-Grained Durability: Control write consistency with sync levels (propose, write, full_text, enrichments, aknn)
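The cross-shard write path can be pictured with a toy, in-memory two-phase commit that uses write intents as locks. This is a sketch of the general technique under simplifying assumptions, not Antfly's code: the Shard class, its method names, and the transaction IDs are hypothetical, and a real coordinator would also persist its commit decision and handle failures and timeouts.

```python
class Shard:
    """In-memory stand-in for one storage shard (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}     # committed key -> value
        self.intents = {}  # key -> (txn_id, pending value)

    def prepare(self, txn_id, writes):
        # Phase 1: refuse if another transaction holds an intent on any key.
        for key in writes:
            holder = self.intents.get(key)
            if holder is not None and holder[0] != txn_id:
                return False
        for key, value in writes.items():
            self.intents[key] = (txn_id, value)
        return True

    def commit(self, txn_id):
        # Phase 2: promote this transaction's intents to committed values.
        for key in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            self.data[key] = self.intents.pop(key)[1]

    def abort(self, txn_id):
        # Drop this transaction's intents without touching committed data.
        for key in [k for k, (t, _) in self.intents.items() if t == txn_id]:
            del self.intents[key]


def two_phase_commit(txn_id, writes_by_shard):
    """Apply writes atomically across shards; returns True on commit."""
    prepared = []
    for shard, writes in writes_by_shard:
        if shard.prepare(txn_id, writes):
            prepared.append(shard)
        else:
            for s in prepared:
                s.abort(txn_id)
            return False
    for shard in prepared:
        shard.commit(txn_id)
    return True


s1, s2 = Shard("s1"), Shard("s2")
ok = two_phase_commit("t1", [(s1, {"a": 1}), (s2, {"b": 2})])

# A competing intent on s2 forces the second transaction to abort everywhere.
s2.intents["b"] = ("other", 99)
blocked = two_phase_commit("t2", [(s1, {"a": 10}), (s2, {"b": 20})])
```

The intent records double as locks: a key with a pending intent from another transaction cannot be prepared, so no shard ever exposes a partially applied write.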
AI-Native Features
- RAG: Streaming retrieval-augmented generation with Server-Sent Events
- Retrieval Agent: Intelligent query routing with automatic query generation
- Document Chunking: Semantic chunking service (Termite) with multi-tier caching
- Async Enrichment: Background embedding and summarization with leader-only processing
- Multimodal Remote Processing: Automatic download and processing of images, PDFs, and HTML from URLs (http/https/s3/file), with SSRF prevention and content validation
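Since RAG responses are streamed as Server-Sent Events, a client has to split the stream into events. The parser below is a minimal sketch of the standard SSE wire format (field lines, blank line as event terminator, lines starting with a colon as comments). The event name "token" and the payloads are made up for the example; the source does not specify which event types Antfly emits.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: one named event, a keep-alive, one multi-line event.
stream = [
    "event: token\n",
    "data: Hello\n",
    "\n",
    ": keep-alive\n",
    "data: line1\n",
    "data: line2\n",
    "\n",
]
events = list(parse_sse(stream))
print(events)
```

In practice the lines would come from a streaming HTTP response rather than a list, but the parsing logic stays the same.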
Developer Experience
- Client Libraries: Auto-generated SDKs for Go, TypeScript, and Python from OpenAPI spec
- JSON Schema Extensions: Custom x-antfly-* annotations for indexing
- MongoDB-Style Transforms: In-place updates ($inc, $set, $push) without read-modify-write
- Linear Merge: Stateless import of sorted records from external sources
- Template Rendering: Handlebars templates for custom document rendering
- Graph Operations: Declarative queries with traversals and pathfinding
- TTL Support: Automatic expiration for documents (table-level) and edges (graph index-level)
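The transform operators named above can be illustrated with a small sketch that applies $inc, $set, and $push to a document held as a Python dict. The semantics are modeled on MongoDB's operators of the same names; the apply_transforms function and the shape of the ops argument are assumptions, since the source does not show Antfly's request format.

```python
def apply_transforms(doc, ops):
    """Apply MongoDB-style operators to doc in place and return it."""
    for field, amount in ops.get("$inc", {}).items():
        # A missing field is treated as 0 before incrementing.
        doc[field] = doc.get(field, 0) + amount
    for field, value in ops.get("$set", {}).items():
        doc[field] = value
    for field, value in ops.get("$push", {}).items():
        # A missing field becomes a new single-element list.
        doc.setdefault(field, []).append(value)
    return doc


doc = {"views": 1, "tags": ["a"]}
updated = apply_transforms(
    doc,
    {"$inc": {"views": 2}, "$set": {"title": "x"}, "$push": {"tags": "b"}},
)
print(updated)
```

The point of server-side transforms is that the client sends only the operators, so two concurrent increments do not overwrite each other the way two read-modify-write cycles can.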
Operations & Scalability
- Online Shard Splitting: Automatic reallocation and splitting of shards without downtime
- Kubernetes Operator: Official antfly-operator with autoscaling and lifecycle management
- Prometheus Metrics: Built-in monitoring endpoints for observability
- Zero-Downtime Upgrades: Rolling upgrades across Raft consensus groups
Flexible Indexing
- Index Types: full_text_v0 (BM25), aknn_v0 (vector with RaBitQ), graph_v0 (relationship traversal)
- Enrichers: embeddingenricher, summarizeenricher, pipelineenricher (async background processing)
- Query Processors: chunker (semantic document splitting), pruner (result filtering), reranker (relevance re-scoring)
- Multimodal: Vision model + embedder or native multimodal embeddings (Gemini)
- Pluggable Architecture: Easy to add custom index types via registry pattern
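A registry pattern for index types generally looks like the sketch below: each implementation registers itself under a type name, and a factory looks the name up at creation time. Everything here is hypothetical, including INDEX_REGISTRY, register_index, create_index, and the demo_v0 type; the source does not describe Antfly's actual registry interface.

```python
INDEX_REGISTRY = {}  # type name -> index class (hypothetical registry)


def register_index(type_name):
    """Class decorator that records an index implementation by name."""
    def decorator(cls):
        if type_name in INDEX_REGISTRY:
            raise ValueError(f"index type already registered: {type_name}")
        INDEX_REGISTRY[type_name] = cls
        return cls
    return decorator


@register_index("demo_v0")  # made-up type name, not a real Antfly index
class DemoIndex:
    def __init__(self, config):
        self.config = config


def create_index(type_name, config):
    """Instantiate a registered index type, or fail on an unknown name."""
    try:
        cls = INDEX_REGISTRY[type_name]
    except KeyError:
        raise ValueError(f"unknown index type: {type_name}") from None
    return cls(config)


idx = create_index("demo_v0", {"field": "title"})
print(type(idx).__name__, idx.config)
```

Adding a new index type then requires only a new class with the decorator; the factory and its callers stay unchanged.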
API Sections
Explore the Antfly API organized by functionality:
- Getting Started: Quick introduction to using the Antfly API.
- Cluster Management: Monitor and manage your Antfly cluster's health and status.
- Table Management: Manage tables in your Antfly cluster. Tables store your documents and support multiple indexes.
- Data Operations: Perform data operations including batch inserts/deletes, linear merge sync, backup/restore, and individual key lookups.
- Query Operations: Execute queries across tables using full-text search, vector similarity search, or hybrid approaches combining both with Reciprocal Rank Fusion (RRF).
- Index Management: Create and manage indexes for full-text search, vector similarity search, and multimodal content.
- User: Operations related to users.
- Permission: Operations related to permissions.
- API Key: Operations related to API key management.