Antfly Public API

Antfly is a distributed key-value store and vector search engine built on a multi-Raft consensus architecture. It pairs database features such as sharding, replication, and distributed transactions with full-text, vector, and AI-assisted retrieval to provide hybrid search.

What Makes Antfly Unique

  • BM25 + Vector Similarity: Full-text (BM25) and vector-embedding search results combined using Reciprocal Rank Fusion (RRF)
  • Multimodal: Index and search text, images, audio, and video
  • Embedding Providers: Ollama, OpenAI, AWS Bedrock, Google Gemini, Anthropic
  • Hardware Acceleration: SIMD optimizations (AVX2/AVX-512 on x86, NEON on ARM) and SME (Scalable Matrix Extension) for ARM
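
The fusion step named in the first bullet can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion in general, not Antfly's implementation: the function name rrf_fuse, the constant k = 60 (a commonly used value), and the sample document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a BM25 query and a vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (doc_a and doc_c here) rise above documents that appear in only one, which is the property that makes RRF useful for hybrid search without having to normalize BM25 and vector scores against each other.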

Distributed Consensus

  • Multi-Raft Design: Separate consensus groups for metadata (one group per cluster) and storage (one group per shard)
  • Horizontal Sharding: Data automatically partitioned across shards with configurable replication
  • Distributed Transactions: Atomic cross-shard writes using coordinator-based 2PC with intent-based locking
  • Fine-Grained Durability: Control write consistency with sync levels (propose, write, full_text, enrichments, aknn)
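
To make the distributed-transactions bullet concrete, here is a toy, in-memory sketch of coordinator-based two-phase commit with write intents acting as locks. It illustrates the general technique only and says nothing about Antfly's internals: the Shard class, the two_phase_commit function, and the transaction IDs are hypothetical, and a real system would also persist each step through consensus and handle coordinator failure.

```python
class Shard:
    """Toy shard holding committed data plus pending write intents."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.intents = {}  # key -> (txn_id, value)

    def prepare(self, txn_id, writes):
        # Phase 1: refuse if any key is locked by another transaction's intent.
        for key in writes:
            owner = self.intents.get(key)
            if owner is not None and owner[0] != txn_id:
                return False
        for key, value in writes.items():
            self.intents[key] = (txn_id, value)
        return True

    def commit(self, txn_id):
        # Phase 2 (success): promote this transaction's intents to real data.
        for key, (owner, value) in list(self.intents.items()):
            if owner == txn_id:
                self.data[key] = value
                del self.intents[key]

    def abort(self, txn_id):
        # Phase 2 (failure): drop this transaction's intents.
        for key, (owner, _) in list(self.intents.items()):
            if owner == txn_id:
                del self.intents[key]


def two_phase_commit(txn_id, writes_by_shard):
    """Coordinator: writes_by_shard is a list of (shard, {key: value}) pairs."""
    prepared = []
    for shard, writes in writes_by_shard:
        if shard.prepare(txn_id, writes):
            prepared.append(shard)
        else:
            for p in prepared:
                p.abort(txn_id)
            return False
    for shard in prepared:
        shard.commit(txn_id)
    return True


s1, s2 = Shard("s1"), Shard("s2")
ok = two_phase_commit("t1", [(s1, {"a": 1}), (s2, {"b": 2})])

# Simulate a conflicting intent left by another in-flight transaction.
s2.intents["c"] = ("other", 0)
ok2 = two_phase_commit("t2", [(s1, {"a": 10}), (s2, {"c": 3})])
```

The second transaction fails on the second shard, so its intent on the first shard is rolled back and neither shard observes a partial write.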

AI-Native Features

  • RAG: Streaming retrieval-augmented generation with Server-Sent Events
  • Retrieval Agent: Intelligent query routing with automatic query generation
  • Document Chunking: Semantic chunking service (Termite) with multi-tier caching
  • Async Enrichment: Background embedding and summarization with leader-only processing
  • Multimodal Remote Processing: Automatic download and processing of images, PDFs, and HTML from URLs (http/https/s3/file), with SSRF prevention and content validation
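
As an illustration of what SSRF prevention for remote content typically involves, the sketch below rejects URLs with a disallowed scheme or a literal IP address in a non-public range. It is an assumption-laden example, not Antfly's actual checks: the function name passes_static_checks and the scheme allowlist are hypothetical, and hostnames are not resolved here, so a real implementation would also need to validate every resolved address before connecting.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed allowlist for this example; the schemes a deployment permits may differ.
ALLOWED_SCHEMES = {"http", "https"}


def passes_static_checks(url):
    """Return False for URLs that are obviously unsafe to fetch server-side."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Not an IP literal. DNS resolution is omitted in this sketch; the
        # resolved addresses would need the same range checks as below.
        return True
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


for candidate in [
    "https://example.com/cat.png",
    "http://169.254.169.254/latest/meta-data/",
    "http://10.0.0.5/internal.pdf",
    "file:///etc/passwd",
]:
    print(candidate, passes_static_checks(candidate))
```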

Developer Experience

  • Client Libraries: Auto-generated SDKs for Go, TypeScript, and Python from the OpenAPI spec
  • JSON Schema Extensions: Custom x-antfly-* annotations for indexing
  • MongoDB-Style Transforms: In-place updates ($inc, $set, $push) without read-modify-write
  • Linear Merge: Stateless import of sorted records from external sources
  • Template Rendering: Handlebars templates for custom document rendering
  • Graph Operations: Declarative queries with traversals and pathfinding
  • TTL Support: Automatic expiration for documents (table-level) and edges (graph index-level)
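
The semantics of the MongoDB-style operators listed above can be shown with a small local sketch. It models only what $set, $inc, and $push mean for top-level fields; the function apply_transform and the sample document are hypothetical, and the request format Antfly expects for transforms is not shown on this page.

```python
def apply_transform(doc, ops):
    """Apply $set, $inc, and $push to a copy of doc (top-level fields only)."""
    out = dict(doc)
    for field, value in ops.get("$set", {}).items():
        out[field] = value
    for field, delta in ops.get("$inc", {}).items():
        # A missing field is treated as 0, as in MongoDB's $inc.
        out[field] = out.get(field, 0) + delta
    for field, item in ops.get("$push", {}).items():
        # Copy the list so the input document is left untouched.
        out[field] = list(out.get(field, [])) + [item]
    return out


doc = {"views": 1, "tags": ["a"]}
ops = {"$inc": {"views": 2}, "$set": {"title": "x"}, "$push": {"tags": "b"}}
updated = apply_transform(doc, ops)
print(updated)
```

When the store applies such operators itself, the client sends only the operator document, which avoids the lost-update race that a client-side read-modify-write cycle can introduce.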

Operations & Scalability

  • Online Shard Splitting: Automatic reallocation and splitting of shards without downtime
  • Kubernetes Operator: Official antfly-operator with autoscaling and lifecycle management
  • Prometheus Metrics: Built-in monitoring endpoints for observability
  • Zero-Downtime Upgrades: Rolling upgrades across Raft consensus groups

Flexible Indexing

  • Index Types: full_text_v0 (BM25), aknn_v0 (vector with RaBitQ), graph_v0 (relationship traversal)
  • Enrichers: embeddingenricher, summarizeenricher, pipelineenricher (async background processing)
  • Query Processors: chunker (semantic document splitting), pruner (result filtering), reranker (relevance re-scoring)
  • Multimodal: Vision model + embedder or native multimodal embeddings (Gemini)
  • Pluggable Architecture: Easy to add custom index types via registry pattern
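
The registry pattern mentioned in the last bullet generally looks like the sketch below: index implementations register themselves under a type name, and a factory looks them up at creation time. Every name here (INDEX_REGISTRY, register_index, create_index, and the index type "keyword_v0") is hypothetical and chosen for illustration; none of them is part of Antfly's API.

```python
INDEX_REGISTRY = {}


def register_index(type_name):
    """Class decorator that records an index implementation under type_name."""
    def decorator(cls):
        if type_name in INDEX_REGISTRY:
            raise ValueError(f"index type already registered: {type_name}")
        INDEX_REGISTRY[type_name] = cls
        return cls
    return decorator


@register_index("keyword_v0")  # hypothetical index type
class KeywordIndex:
    def __init__(self, config):
        self.config = config


def create_index(type_name, config):
    """Factory: build an index instance from its registered type name."""
    try:
        cls = INDEX_REGISTRY[type_name]
    except KeyError:
        raise ValueError(f"unknown index type: {type_name}") from None
    return cls(config)


index = create_index("keyword_v0", {"field": "title"})
print(type(index).__name__, index.config)
```

Adding a new index type then means writing one class and one decorator line, with no changes to the factory or its callers.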

API Sections

Explore the Antfly API organized by functionality:

  • Getting Started: Quick introduction to using the Antfly API.
  • Cluster Management: Monitor and manage your Antfly cluster's health and status.
  • Table Management: Manage tables in your Antfly cluster. Tables store your documents and support multiple indexes.
  • Data Operations: Perform data operations including batch inserts/deletes, linear merge sync, backup/restore, and individual key lookups.
  • Query Operations: Execute queries across tables using full-text search, vector similarity search, or hybrid approaches combining both with Reciprocal Rank Fusion (RRF).
  • Index Management: Create and manage indexes for full-text search, vector similarity search, and multimodal content.
  • User: Operations related to users.
  • Permission: Operations related to permissions.
  • API Key: Operations related to API key management.