Unlock Your Dark Data with Antfly Swarm & Termite
Over 80% of enterprise data sits untapped -- too sensitive to send to cloud AI services, too complex for traditional databases. Antfly Swarm gives you a fully local AI data platform that runs on your laptop or VPS: your data never leaves your infrastructure, and you get production-grade search, RAG, and AI enrichment out of the box.
Antfly is the retrieval layer for unstructured and dark data. It indexes any data source and builds the retrieval system that allows humans or AI to find the right context.
The Dark Data Problem
Every organization has a massive blind spot: dark data. These are the PDFs buried in shared drives, the call recordings gathering dust, the internal wikis that nobody searches, the customer support transcripts that never get analyzed. Industry analysts estimate that over 80% of enterprise data is unstructured and untapped -- invisible to the systems that could extract value from it.
Dark data is not a storage problem — it's a representation problem. The data is already stored. What's missing is the transformation layer that converts multimodal artifacts — PDFs, slide decks, call recordings, scanned documents — into structured, semantically coherent content that machines can reason over. Without that transformation, AI systems can't find the right context, and RAG pipelines fail.
Traditional systems can index filenames, timestamps, and basic metadata. They cannot extract slide structure, interpret diagrams, segment meeting transcripts by topic, understand speaker roles, or represent semantic relationships across modalities. This is why most enterprise knowledge remains dark.
The AI revolution was supposed to fix this. Large language models can now read documents, transcribe audio, and understand images. But there's a catch: to use most AI services, you have to send your data to someone else's servers. For healthcare records, financial documents, legal contracts, and proprietary research, that's a non-starter.
Why Local-First Matters
When you send data to a cloud embedding API like OpenAI's, your documents travel across the internet and get processed on third-party hardware; even if the provider promises not to retain them, you've lost cryptographic control. For many organizations, that violates compliance requirements (HIPAA, GDPR, SOC 2) and internal security policies.
A local-first approach means the entire AI pipeline runs on infrastructure you control. No API keys to manage, no usage-based billing surprises, no vendor lock-in, and most importantly -- zero data leaving your perimeter. This is what tools like Ollama pioneered for LLM inference. Antfly extends this philosophy to the entire data stack: storage, indexing, embedding, chunking, re-ranking, and search.
Solving Dark Data Requires a Retrieval Stack
Unlocking dark data isn't a single feature — it's a layered architecture. Each layer handles a different part of the problem:
1. Ingestion and transformation: convert raw artifacts (PDFs, slide decks, recordings) into structured, semantically chunked content.
2. Indexing: build keyword (BM25) and vector indexes over that content, with embeddings generated locally.
3. Retrieval: hybrid search plus re-ranking to surface the right context for a query.
4. Application: the RAG pipelines, agents, and tools built on top of retrieval.
Antfly Swarm handles layers 1 through 3 out of the box — and provides the API surface to build layer 4. This is what makes it a retrieval layer, not just a database.
What is Antfly Swarm?
At its core, Antfly is a retrieval database — it indexes any data source and builds the retrieval system that allows humans or AI to find the right context. Swarm is its single-binary deployment mode, packaging the entire platform — database, search engine, model runner, and API server — into one process you can run on your laptop, a Raspberry Pi, or a small VPS. Think of it as docker compose for your AI data stack, except it's a single Go binary with zero dependencies.
Minimum Requirements
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4+ cores |
| RAM | 4 GB | 8+ GB |
| Storage | 20 GB | SSD recommended |
| OS | Linux, macOS, or Windows | any |
Swarm mode handles automatic node discovery, data sharding, and all the distributed consensus machinery under the hood. For development, you get the exact same API surface as a production multi-node cluster -- so code you write locally works identically in production.
What is Termite?
If Ollama is "Docker for LLMs," then Termite is "Docker for RAG pipeline models." Termite is Antfly's built-in model garden and local ONNX runtime for the smaller, specialized models that power retrieval -- embedding models, re-rankers, chunkers, and classifiers. These models run locally using hardware-accelerated SIMD instructions (AVX-512, NEON, SME), so your data never leaves your machine.
What Termite Runs Locally
| Capability | What It Does | Example Model |
|---|---|---|
| Embeddings | Convert text/images to vectors | bge-small-en-v1.5 |
| Chunking | Semantic document splitting | Built-in with multi-tier caching |
| Re-ranking | Re-score results for relevance | Cross-encoder models |
| Classification | Categorize and tag documents | ONNX classifier models |
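To make the "Embeddings" row concrete, here is what an embedding model does, sketched with the open sentence-transformers library rather than Termite itself. This is an illustration of the model class, not Termite code; Termite runs the same kind of model via ONNX with SIMD acceleration.
# Conceptual illustration of what an embedding model does. This uses the
# sentence-transformers library, NOT Termite itself -- Termite runs the same
# class of models locally via ONNX with SIMD acceleration.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")
vectors = model.encode(["Revenue grew 23% year-over-year"])
print(vectors.shape)  # (1, 384): one 384-dimensional vector per input text
Texts with similar meaning map to nearby vectors, which is what makes semantic search over dark data possible in the first place.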
How It All Fits Together
[Interactive diagram in the original post: how data flows through the local stack.]
Getting Started
Getting Antfly Swarm running takes about 60 seconds. Download the binary for your platform and start it:
# Download the latest release
curl -fsSL https://get.antfly.io | sh
# Start Antfly in Swarm mode (single binary, zero config)
antfly swarm start
# Verify it's running
curl http://localhost:8080/health
That's it. You now have AntflyDB with Termite running locally. The REST API is available at http://localhost:8080 and is fully OpenAPI-compatible, so you can use the auto-generated SDKs for Go, TypeScript, or Python.
Install the Python SDK
pip install antfly
Connecting Your Dark Data
Antfly processes each modality differently — because a PDF is not a slide deck is not a call recording. Under the hood, every load runs the same pipeline: semantic chunking, local embedding via Termite, and hybrid indexing.
The first step is creating a table and telling AntflyDB what kind of data you'll be storing and how to index it. Here we'll create a table for internal documents with hybrid search (BM25 + vector) and local embeddings via Termite:
# Create a table with hybrid search + local embeddings
curl -X POST http://localhost:8080/tables \
-H "Content-Type: application/json" \
-d '{
"name": "internal_docs",
"indexes": [{
"type": "aknn_v0",
"fields": ["content"],
"embedder": {
"provider": "termite",
"model": "bge-small-en-v1.5"
}
}, {
"type": "full_text_v0",
"fields": ["content", "title"]
}]
}'
Notice the "provider": "termite" -- this tells AntflyDB to use the local ONNX model runner for embeddings instead of calling an external API. Your documents are embedded on your machine, stored on your machine, and indexed on your machine. Nothing leaves.
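If you prefer the Python SDK over raw HTTP, the same table definition could look roughly like this. A sketch only: the create_table method name and argument shape are assumptions inferred from the REST payload above, not confirmed SDK API.
from antfly import AntflyClient

client = AntflyClient("http://localhost:8080")

# Hypothetical SDK call mirroring the REST payload above -- the method name
# and argument shape are assumptions, not confirmed API.
client.create_table(
    name="internal_docs",
    indexes=[
        {
            "type": "aknn_v0",
            "fields": ["content"],
            "embedder": {"provider": "termite", "model": "bge-small-en-v1.5"},
        },
        {"type": "full_text_v0", "fields": ["content", "title"]},
    ],
)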
Insert Documents
Now insert your sensitive documents. Antfly handles chunking, embedding, and indexing automatically in the background:
from antfly import AntflyClient
client = AntflyClient("http://localhost:8080")
# Insert documents — Antfly handles embedding automatically
client.table("internal_docs").insert([
{
"title": "Q4 Financial Review",
"content": "Revenue grew 23% year-over-year...",
"department": "finance",
"classification": "confidential"
},
{
"title": "Patient Case Study #4821",
"content": "Patient presented with symptoms...",
"department": "medical",
"classification": "hipaa_protected"
}
])
# Antfly automatically:
# 1. Chunks large documents semantically (via Termite)
# 2. Generates embeddings locally (via Termite + ONNX)
# 3. Indexes for both BM25 keyword + vector similarity search
# 4. All processing happens on YOUR machine
Querying Your Dark Data
Antfly's hybrid search combines keyword matching (BM25) with semantic understanding (vector similarity) using Reciprocal Rank Fusion. This means you get exact matches when the user types precise terms, and conceptual matches when they describe what they're looking for:
# Hybrid search — combines keyword + semantic matching
results = client.table("internal_docs").search(
query="what was our revenue growth last quarter",
limit=5
)
for doc in results:
print(f"{doc['title']} (score: {doc['_score']:.3f})")
print(f" {doc['content'][:100]}...")
# Output:
# Q4 Financial Review (score: 0.847)
# Revenue grew 23% year-over-year...
This query works even though we typed "revenue growth" and the document says "Revenue grew" -- the semantic embedding understands they mean the same thing. The BM25 component also boosts the result because "revenue" appears as an exact keyword match. Both signals get fused together via RRF.
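To make the fusion step concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is illustrative, not Antfly's internals; k=60 is the constant commonly used in the RRF literature, not a documented Antfly default.
# Minimal Reciprocal Rank Fusion sketch (illustrative, not Antfly's internals).
# Each input is a list of doc IDs ordered best-first; k=60 is the value
# commonly used in the RRF literature, not a documented Antfly default.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

bm25 = ["q4_review", "roadmap", "case_4821"]      # keyword ranking
vector = ["q4_review", "case_4821", "handbook"]   # semantic ranking
print(rrf_fuse([bm25, vector]))  # "q4_review" wins: ranked first in both lists
A document that appears high in both rankings accumulates score from both, which is why an exact keyword hit that is also semantically relevant rises to the top.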
Why Basic RAG Isn't Enough for Dark Data
Most RAG systems assume clean text, pre-chunked documents, and known structure. Dark data violates all of those assumptions. When you point a naive RAG pipeline at real enterprise content, it breaks in predictable ways:
- Flat chunking destroys document hierarchy, so retrieved passages lose their surrounding context.
- Text-only pipelines skip diagrams, slides, and scanned pages entirely.
- Related documents are never linked, so answers miss cross-document evidence.
- Answers arrive without source-grounded citations, so users can't verify them.
- Content changes but indexes don't, so results go stale.
A production-grade system needs hierarchical indexing, multimodal processing, cross-document linking, source-grounded citations, and continuous re-indexing. This is infrastructure, not a feature. It's why Antfly builds these capabilities into the database layer rather than leaving them to application code.
Building a Local RAG Pipeline
The real power comes when you combine Antfly's retrieval with an LLM for answering questions. Antfly has a built-in RAG endpoint that streams answers via Server-Sent Events. You can connect any LLM -- local (via Ollama) or remote:
# Fully local RAG: Antfly retrieval + Ollama generation
# Zero data leaves your machine
import requests
response = requests.post("http://localhost:8080/rag", json={
"table": "internal_docs",
"query": "Summarize our financial performance last quarter",
"generator": {
"provider": "ollama",
"model": "llama3.2"
},
"retriever": {
"limit": 5,
"reranker": True # Termite re-ranks for better context
}
}, stream=True)
# Stream the answer as it's generated
for chunk in response.iter_lines():
if chunk:
print(chunk.decode(), end="", flush=True)
Cloud vs. Local: Side by Side
Here's how the typical cloud AI stack compares to Antfly Swarm for sensitive data workloads:
| | Cloud AI Stack | Antfly Swarm |
|---|---|---|
| Data residency | Third-party servers | Your machine only |
| Vendors needed | 3+ (DB, embeddings, vector store) | 1 binary |
| Compliance | Depends on vendor DPAs | Full control -- you own the infra |
| Cost model | Per-token / per-query billing | Fixed infra cost |
| Latency | Network round-trips | Local -- sub-50ms p99 |
| Offline capable | No | Yes |
| Embedding lifecycle | Manual management | Automatic (Termite) |
Use Cases: Where Dark Data Lives
The examples from earlier in this post are exactly where Antfly fits: PDFs buried in shared drives, call recordings, internal wikis, and support transcripts -- plus the regulated content that can't leave your perimeter, like healthcare records, financial documents, legal contracts, and proprietary research.
Tradeoffs to Consider
Being transparent: a local-first approach isn't the right choice for every workload. Here's what to consider:
Model size vs. accuracy
Local embedding models (like bge-small) are smaller than cloud models (like OpenAI's text-embedding-3-large). For most retrieval tasks, the difference is negligible -- but for highly specialized domains, you may want to benchmark. Termite supports swapping models easily.
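For instance, pointing a new table at a larger embedding model is just a config change. A sketch, reusing the hypothetical create_table call from earlier; bge-base-en-v1.5 is a real open model, but its availability in Termite's model garden is an assumption.
from antfly import AntflyClient

client = AntflyClient("http://localhost:8080")

# Hypothetical create_table call (same assumption as the earlier SDK sketch).
# bge-base-en-v1.5 is a real open model; whether Termite ships it is an
# assumption for illustration.
client.create_table(
    name="research_docs",
    indexes=[{
        "type": "aknn_v0",
        "fields": ["content"],
        "embedder": {"provider": "termite", "model": "bge-base-en-v1.5"},
    }],
)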
Hardware requirements
Running models locally requires CPU/RAM. Swarm mode needs at least 4GB RAM. For larger datasets with re-ranking, 8-16GB is recommended. No GPU required -- Termite uses SIMD (AVX-512, NEON) for hardware acceleration.
LLM generation
For the generation step (chat, answering), local LLMs via Ollama are capable but smaller than GPT-4 or Claude. You can always use Antfly's local retrieval with a cloud LLM -- only the final query context goes to the API, not your raw documents.
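A sketch of that hybrid setup, assuming a hypothetical "openai" generator provider (this post only names "ollama" explicitly):
import requests

# Local retrieval, cloud generation: only the retrieved context snippets go
# to the LLM API, never your raw corpus. The "openai" provider name and
# model are assumptions for illustration; the post only names "ollama".
response = requests.post("http://localhost:8080/rag", json={
    "table": "internal_docs",
    "query": "Summarize our financial performance last quarter",
    "generator": {"provider": "openai", "model": "gpt-4o"},
    "retriever": {"limit": 5, "reranker": True}
}, stream=True)

for chunk in response.iter_lines():
    if chunk:
        print(chunk.decode(), end="", flush=True)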
Scaling beyond one machine
Swarm mode is designed for development and small-to-medium deployments. For production scale, AntflyDB supports multi-node clusters with the same API surface. Code you write in Swarm works in production without changes.
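In practice, the only thing that changes is the endpoint your client points at. A sketch, with a placeholder cluster URL:
from antfly import AntflyClient

# Same client code, different endpoint -- the cluster URL is a placeholder.
dev = AntflyClient("http://localhost:8080")
prod = AntflyClient("https://antfly.internal.example.com:8080")

# Identical calls work against either deployment.
results = prod.table("internal_docs").search(query="revenue growth", limit=5)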
What's Next
Antfly Swarm and Termite are available now. Here's how to get started and stay connected:
Query meaning, not files. Retrieve context, not documents.
Your data. Your infrastructure. Your AI.
Stop sending sensitive data to cloud APIs. Antfly is the retrieval layer for your unstructured data — running locally in a single binary.
Get Started Free