Class HnswGraphBuilder

java.lang.Object
org.apache.lucene.util.hnsw.HnswGraphBuilder
All Implemented Interfaces:
HnswBuilder
Direct Known Subclasses:
InitializedHnswGraphBuilder, MergingHnswGraphBuilder

public class HnswGraphBuilder extends Object implements HnswBuilder
Builder for an HNSW graph. See HnswGraph for a summary of the algorithm and the meaning of the hyperparameters.

Thread-safety: This class is NOT thread-safe, so a single instance cannot be shared across threads. However, it IS safe for multiple HnswGraphBuilder instances to build the same graph, provided the graph's size is known up front (for example, when merging).
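The thread-safety note above amounts to a "one builder per thread, one shared pre-sized graph" pattern. The following is a minimal sketch of that pattern in plain Java, not Lucene code: Worker, buildConcurrently, and the int[] standing in for the graph are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadBuilderSketch {

  // Hypothetical stand-in for a builder: it keeps mutable per-instance scratch
  // state, so a single instance must only ever be used by one thread.
  static final class Worker {
    private final int[] shared; // shared structure, sized before building starts
    private int scratch;        // per-instance state, not safe to share

    Worker(int[] shared) {
      this.shared = shared;
    }

    void addNode(int ord) {
      scratch = ord * 2 + 1;
      shared[ord] = scratch; // each ordinal is written by exactly one worker
    }
  }

  // Builds one shared structure with one Worker per thread.
  static int[] buildConcurrently(int size, int numThreads) {
    int[] shared = new int[size];            // size is known up front
    AtomicInteger next = new AtomicInteger(); // hands out node ordinals
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        Worker worker = new Worker(shared);  // never shared between threads
        futures.add(pool.submit(() -> {
          int ord;
          while ((ord = next.getAndIncrement()) < size) {
            worker.addNode(ord);
          }
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // wait for completion; also makes the writes visible here
      }
    } catch (Exception e) {
      throw new IllegalStateException("concurrent build failed", e);
    } finally {
      pool.shutdown();
    }
    return shared;
  }

  public static void main(String[] args) {
    int[] out = buildConcurrently(1000, 4);
    System.out.println("slots filled: " + out.length);
  }
}
```

The sketch only shows the ownership rule (per-thread builder state, shared pre-sized target). How Lucene coordinates concurrent writes to a real graph is not described in this section.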

  • Field Details

    • DEFAULT_MAX_CONN

      public static final int DEFAULT_MAX_CONN
      Default maximum number of connections per node.
    • DEFAULT_BEAM_WIDTH

      public static final int DEFAULT_BEAM_WIDTH
      Default size of the queue maintained while searching during graph construction.
    • HNSW_COMPONENT

      public static final String HNSW_COMPONENT
      A name for the HNSW component for the info-stream.
    • randSeed

      public static long randSeed
      Random seed for level generation; public to expose for testing.
    • M

      protected final int M
    • scorer

      protected final UpdateableRandomVectorScorer scorer
    • graphSearcher

      protected final HnswGraphSearcher graphSearcher
    • beamCandidates

      protected final HnswGraphBuilder.GraphBuilderKnnCollector beamCandidates
    • hnsw

      protected final OnHeapHnswGraph hnsw
    • hnswLock

      protected final org.apache.lucene.util.hnsw.HnswLock hnswLock
    • infoStream

      protected InfoStream infoStream
    • frozen

      protected boolean frozen
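To illustrate why a seed (see randSeed above) makes level generation repeatable, here is a minimal sketch of the level-assignment scheme commonly used in HNSW, where a node's top level is drawn as floor(-ln(U) / ln(M)) for a uniform random U in (0, 1). The class and method names are hypothetical, and the exact formula and random generator used by this class are assumptions, not taken from this page.

```java
import java.util.SplittableRandom;

class LevelSketch {

  // Draws a level using the standard HNSW scheme. ml is the normalization
  // factor, assumed here to be 1 / ln(M).
  static int randomLevel(double ml, SplittableRandom random) {
    double r;
    do {
      r = random.nextDouble(); // uniform in [0, 1)
    } while (r == 0.0);        // avoid log(0)
    return (int) (-Math.log(r) * ml);
  }

  // Generates levels for `count` nodes from a fixed seed.
  static int[] levels(long seed, int M, int count) {
    if (M < 2) {
      throw new IllegalArgumentException("M must be at least 2 in this sketch");
    }
    double ml = 1.0 / Math.log(M);
    SplittableRandom random = new SplittableRandom(seed);
    int[] result = new int[count];
    for (int i = 0; i < count; i++) {
      result[i] = randomLevel(ml, random);
    }
    return result;
  }

  public static void main(String[] args) {
    int[] a = levels(42L, 16, 10);
    int[] b = levels(42L, 16, 10);
    // The same seed yields the same sequence of levels.
    System.out.println(java.util.Arrays.equals(a, b));
  }
}
```

Most nodes land on level 0 under this scheme, and each higher level holds roughly a factor of M fewer nodes.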
  • Constructor Details

    • HnswGraphBuilder

      protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, int graphSize) throws IOException
      Creates a builder that reads vectors through the given scorer supplier and builds a graph connecting them by their dense ordinals, using the given hyperparameter settings.
      Parameters:
      scorerSupplier - a supplier that creates vector scorers from ordinals.
      M - graph fanout parameter used to calculate the maximum number of connections a node can have: M on upper layers, and M * 2 on the lowest level.
      beamWidth - the size of the beam search to use when finding nearest neighbors.
      seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
      graphSize - size of the graph; pass -1 if unknown.
      Throws:
      IOException
    • HnswGraphBuilder

      protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw) throws IOException
      Throws:
      IOException
    • HnswGraphBuilder

      protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int beamWidth, long seed, OnHeapHnswGraph hnsw) throws IOException
      Throws:
      IOException
    • HnswGraphBuilder

      protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw, org.apache.lucene.util.hnsw.HnswLock hnswLock, HnswGraphSearcher graphSearcher) throws IOException
      Creates a builder that reads vectors through the given scorer supplier and builds a graph connecting them by their dense ordinals, using the given hyperparameter settings.
      Parameters:
      scorerSupplier - a supplier that creates vector scorers from ordinals.
      M - graph fanout parameter used to calculate the maximum number of connections a node can have: M on upper layers, and M * 2 on the lowest level.
      beamWidth - the size of the beam search to use when finding nearest neighbors.
      seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
      hnsw - the graph to build; it may have been previously initialized.
      Throws:
      IOException
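The M parameter description above implies a simple per-level connection limit: M on upper layers and M * 2 on the lowest level. A minimal sketch of that rule follows; MaxConnSketch and maxConnections are hypothetical names, not part of this class.

```java
class MaxConnSketch {

  // Returns the maximum number of connections for a node on the given level,
  // following the documented rule: M * 2 on level 0, M on every upper level.
  static int maxConnections(int M, int level) {
    if (M <= 0) {
      throw new IllegalArgumentException("M must be positive");
    }
    if (level < 0) {
      throw new IllegalArgumentException("level must be non-negative");
    }
    return level == 0 ? M * 2 : M;
  }

  public static void main(String[] args) {
    int M = 16; // example fanout chosen for illustration
    for (int level = 0; level < 3; level++) {
      System.out.println("level " + level + ": " + maxConnections(M, level));
    }
  }
}
```

With M = 16, a node may hold up to 32 connections on level 0 and up to 16 on each level above it.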
  • Method Details