10M Vectors. 4GB RAM. Zero Training. Meet turbovec

Vector search shouldn’t cost you 30GB of RAM, a separate training phase, or a recall penalty when you add filters. If you’re building RAG systems where memory, latency, or privacy matter, FAISS is starting to feel heavy.

turbovec is a Rust-native vector index with Python bindings that compresses embeddings 8–16x, skips the train step entirely, and matches or beats FAISS’s IndexPQFastScan on search speed.

Why it’s different

  • Faster than FAISS: Hand-tuned NEON & AVX-512 kernels beat IndexPQFastScan by 12–20% on ARM, and match or beat it on x86.
  • 31GB → 4GB: A 10M-vector corpus shrinks from ~31GB as float32 to ~4GB with 4-bit quantization. No recall cliff.
  • Zero training, online ingest: Add vectors anytime. No codebook training, no rebuilds, no parameter tuning. The index grows with your data.
  • Native filtering at search time: Pass an allowlist to .search() and the SIMD kernel skips disallowed blocks before scoring. No over-fetching. No recall penalty.
  • 100% local: No managed service. No telemetry. Pair with any open embedding model for a fully air-gapped RAG stack.
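
The 31GB → 4GB figure above is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below is not part of turbovec; the helper names are made up for illustration, and the 768-dimension embedding width is an assumption chosen because it reproduces the quoted numbers (the post does not state the dimension). It also ignores any per-vector metadata the index may store.

```python
def float32_bytes(n_vectors: int, dim: int) -> int:
    """Raw storage for float32 embeddings: 4 bytes per coordinate."""
    return n_vectors * dim * 4


def quantized_bytes(n_vectors: int, dim: int, bit_width: int) -> int:
    """Storage for codes only, at bit_width bits per coordinate."""
    return n_vectors * dim * bit_width // 8


def compression_ratio(bit_width: int) -> float:
    """float32 uses 32 bits per coordinate, so the ratio is 32 / bit_width."""
    return 32 / bit_width


n_vectors = 10_000_000
dim = 768  # assumption: not stated in the post, chosen to match the 31GB figure

print(float32_bytes(n_vectors, dim) / 1e9)       # 30.72 (GB)
print(quantized_bytes(n_vectors, dim, 4) / 1e9)  # 3.84 (GB)
print(compression_ratio(4))                      # 8.0
```

By the same arithmetic, 4 bits per coordinate gives the 8x end of the quoted 8–16x range, and a 16x ratio would correspond to 2 bits per coordinate. For 1536-dimensional embeddings the same 10M vectors would take about 61GB as float32 and about 7.7GB at 4 bits.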

How to use it

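pip install turbovec

# `vectors`, `more_vectors`, and `query` are your own embeddings
# (assumed here to be float32 NumPy arrays of width `dim`).
from turbovec import TurboQuantIndex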

# No train phase. Just index.
index = TurboQuantIndex(dim=1536, bit_width=4)
index.add(vectors)
index.add(more_vectors)  # online ingest

scores, ids = index.search(query, k=10)

# Need stable external IDs & O(1) deletes?
from turbovec import IdMapIndex
idx = IdMapIndex(dim=1536, bit_width=4)
idx.add_with_ids(vectors, ids)
idx.search(query, k=10, allowlist=sql_candidate_ids)  # hybrid retrieval
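
To see why filtering inside the search matters, here is a minimal pure-Python sketch that contrasts the two strategies on toy data. It is a conceptual illustration only: the function names are hypothetical, it uses brute-force dot products, and it says nothing about how turbovec’s SIMD kernel skips blocks internally.

```python
import heapq


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def search_post_filter(vectors, query, k, allowed):
    """Take the global top-k first, then drop ids outside the allowlist."""
    top = heapq.nlargest(k, vectors.items(), key=lambda item: dot(item[1], query))
    return [vid for vid, _ in top if vid in allowed]


def search_pre_filter(vectors, query, k, allowed):
    """Score only allowlisted ids, so all k slots go to valid candidates."""
    candidates = ((vid, vec) for vid, vec in vectors.items() if vid in allowed)
    top = heapq.nlargest(k, candidates, key=lambda item: dot(item[1], query))
    return [vid for vid, _ in top]


# Toy corpus: id -> 2-d vector. Ids 0 and 1 are the closest to the query.
vectors = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.8, 0.2],
    3: [0.1, 0.9],
    4: [0.2, 0.8],
}
query = [1.0, 0.0]
allowed = {1, 3, 4}  # e.g. ids returned by a SQL pre-selection

print(search_post_filter(vectors, query, k=2, allowed=allowed))  # [1]
print(search_pre_filter(vectors, query, k=2, allowed=allowed))   # [1, 4]
```

Post-filtering returns fewer than k results unless you over-fetch, because disallowed ids consume slots in the top-k. Filtering before scoring always fills k slots with valid candidates when enough exist.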

Drop into your stack

Swap your in-memory vector store in one line. Same API, same pipeline wiring:

  • pip install turbovec[langchain]
  • pip install turbovec[llama-index]
  • pip install turbovec[haystack]
  • pip install turbovec[agno]

How it works (in a sentence)

turbovec runs Google Research’s TurboQuant algorithm: normalize → apply a fixed random rotation (making coordinate distributions predictable) → precomputed Lloyd-Max scalar quantization → length-renormalized scoring. The math replaces codebook training. SIMD replaces decompression. Result: distortion within 2.7x of Shannon’s lower bound, with zero data dependency.
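
The four steps above can be sketched in NumPy. This is a toy reading of the pipeline, not turbovec’s implementation or the reference TurboQuant code: every function name is hypothetical, the Lloyd-Max levels are fitted on synthetic Gaussian samples (never on your data), codes are left unpacked as one byte each, and “length-renormalized scoring” is interpreted here as rescaling each reconstructed vector to unit length before the dot product.

```python
import numpy as np


def random_rotation(dim, seed=0):
    """Fixed, data-independent orthogonal matrix from the QR of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))  # sign fix keeps the rotation uniformly random


def lloyd_max_levels(bit_width, n_samples=100_000, iters=30, seed=1):
    """Lloyd-Max reconstruction levels for a standard normal (synthetic samples only)."""
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal(n_samples)
    n_levels = 2 ** bit_width
    levels = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(iters):
        edges = (levels[:-1] + levels[1:]) / 2   # decision boundaries
        cells = np.searchsorted(edges, samples)  # cell index of each sample
        levels = np.array([samples[cells == c].mean() for c in range(n_levels)])
    return levels


def encode(vectors, rotation, levels):
    """Normalize, rotate, then scalar-quantize every coordinate to a level index."""
    dim = vectors.shape[1]
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    rotated = (unit @ rotation.T) * np.sqrt(dim)  # coordinates are roughly N(0, 1)
    edges = (levels[:-1] + levels[1:]) / 2
    return np.searchsorted(edges, rotated).astype(np.uint8)


def search(query, codes, rotation, levels, k):
    """Approximate cosine similarity against length-renormalized reconstructions."""
    q = rotation @ (query / np.linalg.norm(query))
    recon = levels[codes]
    recon = recon / np.linalg.norm(recon, axis=1, keepdims=True)
    scores = recon @ q
    top = np.argsort(-scores)[:k]
    return scores[top], top


rng = np.random.default_rng(42)
dim = 128
vectors = rng.standard_normal((1000, dim)).astype(np.float32)

rotation = random_rotation(dim)           # fixed once, independent of the data
levels = lloyd_max_levels(bit_width=4)    # 16 levels, precomputed
codes = encode(vectors, rotation, levels)

query = vectors[7] + 0.05 * rng.standard_normal(dim)  # noisy copy of row 7
scores, ids = search(query, codes, rotation, levels, k=5)
print(ids[0], round(float(scores[0]), 3))
```

Nothing in `random_rotation` or `lloyd_max_levels` looks at the corpus, which is the sense in which the approach has no training phase: new vectors can be encoded one at a time with the same rotation and levels.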

If your vector index is eating RAM, slowing down under filters, or demanding a train phase you don’t need, downgrade the footprint and upgrade the speed.

pip install turbovec and ship it. 🔍💨

GitHub - RyanCodrai/turbovec: A vector index built on TurboQuant, written in Rust with Python bindings