Elastic Introduces New Vector Storage Format DiskBBQ for More Efficient Vector Search

Elastic, the Search AI Company, announced DiskBBQ, a new disk-friendly vector search algorithm in Elasticsearch that delivers more efficient vector search at scale than traditional industry-standard search techniques used in many vector databases. DiskBBQ eliminates the need to keep entire vector indexes in memory, delivers predictable performance, and costs less.

 

Hierarchical Navigable Small World (HNSW) is the most commonly used search technique in vector databases because of its speed and accuracy in similarity search. However, it requires all vectors to reside in memory, which can be costly at large scale. DiskBBQ, available now in Elasticsearch 9.2, addresses this with BBQ (Better Binary Quantization): it compresses vectors efficiently and clusters them into compact partitions so that only selected partitions are read from disk. This reduces RAM usage, avoids spikes in data retrieval time, and improves system performance for data ingestion and organization.
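To make the compression idea concrete, here is a minimal sketch of plain 1-bit sign quantization in Python. It is not Elastic's BBQ implementation, whose details the announcement does not give. The function names (`binarize`, `hamming`, `bytes_float32`, `bytes_binary`) are hypothetical, and the scheme shown is the simplest form of binary quantization, chosen only to illustrate why compressed vectors take so much less space.

```python
def binarize(vec, centroid):
    """Keep one bit per dimension: 1 if the component lies above the centroid."""
    bits = 0
    for i, (x, c) in enumerate(zip(vec, centroid)):
        if x - c > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Count differing bits; a cheap stand-in for distance between binarized vectors."""
    return bin(a ^ b).count("1")


def bytes_float32(dims):
    """Storage for one uncompressed vector at 4 bytes per dimension."""
    return dims * 4


def bytes_binary(dims):
    """Storage for one vector at 1 bit per dimension, rounded up to whole bytes."""
    return (dims + 7) // 8


if __name__ == "__main__":
    origin = [0.0, 0.0, 0.0, 0.0]
    a = binarize([1.0, -1.0, 0.5, -0.2], origin)
    b = binarize([-0.3, 0.8, 0.4, -0.9], origin)
    print(a, b, hamming(a, b))
    print(bytes_float32(1024), bytes_binary(1024))
```

Under this simplified scheme, a 1,024-dimension float32 vector shrinks from 4,096 bytes to 128 bytes. Production quantizers typically store extra correction terms to recover accuracy, so real savings are somewhat smaller.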

“As AI applications scale, traditional vector storage formats force them to choose between slow indexing or significant infrastructure costs required to overcome memory limitations,” said Ajay Nair, general manager, Platform at Elastic. “DiskBBQ is a smarter, more scalable approach to high-performance vector search on very large datasets that accelerates both indexing and retrieval.”

In benchmark testing, DiskBBQ demonstrated a balance of speed, stability and efficiency that is ideal for large-scale vector search on lower-cost memory infrastructure and object storage. HNSW keeps the entire graph in RAM. As a disk-friendly approximate nearest neighbor (ANN) algorithm, DiskBBQ requires far less memory because it offloads data to disk and reads only the relevant vector clusters at query time. This design removes memory as a limiting factor, enabling Elasticsearch to scale to massive datasets limited only by CPU and disk.
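The idea of reading only relevant clusters at query time can be sketched as a generic partitioned (inverted-file style) search. This is an illustration of the general technique, not DiskBBQ's actual code: the function names, the `n_probe` parameter, and the hand-picked centroids are all assumptions, and a real system would learn centroids from the data and keep each partition on disk rather than in a Python dict.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_partitions(vectors, centroids):
    """Assign every vector id to the partition of its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        partitions[nearest].append(vid)
    return partitions


def search(query, vectors, centroids, partitions, k=1, n_probe=1):
    """Scan only the n_probe partitions closest to the query.

    Returns the top-k vector ids and the number of vectors actually scanned.
    """
    by_closeness = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in by_closeness[:n_probe] for vid in partitions[c]]
    ranked = sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))
    return ranked[:k], len(candidates)


# Toy data: five 2-D vectors and three hand-picked (hypothetical) centroids.
VECTORS = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [10.0, 0.0]]
CENTROIDS = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
PARTITIONS = build_partitions(VECTORS, CENTROIDS)

if __name__ == "__main__":
    ids, scanned = search([4.9, 5.1], VECTORS, CENTROIDS, PARTITIONS, k=1, n_probe=1)
    print(ids, scanned, "of", len(VECTORS))
```

In this toy run the query touches two of the five vectors. Raising `n_probe` trades more reads for better recall, which is the usual tuning knob in partition-based ANN search.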

DiskBBQ sustained query latencies of roughly 15 milliseconds while operating in as little as 100 MB of total memory, where traditional HNSW indexing could not run. As available memory increased, DiskBBQ’s performance scaled smoothly without the sharp latency cliffs typical of in-memory graph approaches.
