Class Lucene90HnswGraphBuilder
java.lang.Object
org.apache.lucene.backward_codecs.lucene90.Lucene90HnswGraphBuilder
Builder for HNSW graph. See Lucene90OnHeapHnswGraph for a gloss on the algorithm and the meaning of the hyperparameters.
This class is preserved here only for tests.
-
Field Summary
Fields (modifier and type, field, description):
private final int beamWidth
private final Lucene90BoundsChecker bound
private final RandomAccessVectorValues<float[]> buildVectors
private static final long DEFAULT_RAND_SEED - Default random seed for level generation
(package private) final Lucene90OnHeapHnswGraph hnsw
static final String HNSW_COMPONENT - A name for the HNSW component for the info-stream
private InfoStream infoStream
private final int maxConn
private final SplittableRandom random
static long randSeed - Random seed for level generation; public to expose for testing
private final Lucene90NeighborArray scratch
private final VectorSimilarityFunction similarityFunction
private final RandomAccessVectorValues<float[]> vectorValues
-
Constructor Summary
Constructors (constructor, description):
Lucene90HnswGraphBuilder(RandomAccessVectorValues<float[]> vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed) - Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
-
Method Summary
Methods (modifier and type, method, description):
private void addDiverseNeighbors(int node, NeighborQueue candidates)
(package private) void addGraphNode(float[] value) - Inserts a doc with vector value to the graph
build(RandomAccessVectorValues<float[]> vectors) - Reads all the vectors from two copies of a RandomAccessVectorValues.
private boolean diversityCheck(float[] candidate, float score, Lucene90NeighborArray neighbors, RandomAccessVectorValues<float[]> vectorValues)
private void diversityUpdate(Lucene90NeighborArray neighbors)
private int findNonDiverse(Lucene90NeighborArray neighbors)
private void popToScratch(NeighborQueue candidates)
private void selectDiverse(Lucene90NeighborArray neighbors, Lucene90NeighborArray candidates)
void setInfoStream(InfoStream infoStream) - Set info-stream to output debugging information
-
Field Details
-
DEFAULT_RAND_SEED
private static final long DEFAULT_RAND_SEED
Default random seed for level generation
-
HNSW_COMPONENT
A name for the HNSW component for the info-stream
-
randSeed
public static long randSeed
Random seed for level generation; public to expose for testing
-
maxConn
private final int maxConn
-
beamWidth
private final int beamWidth
-
scratch
-
similarityFunction
-
vectorValues
-
random
-
bound
-
hnsw
-
infoStream
-
buildVectors
-
-
Constructor Details
-
Lucene90HnswGraphBuilder
public Lucene90HnswGraphBuilder(RandomAccessVectorValues<float[]> vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed) throws IOException
Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
- Parameters:
vectors - the vectors whose relations are represented by the graph; must provide a different view over those vectors than the one used to add via addGraphNode.
maxConn - the number of connections to make when adding a new graph node; roughly speaking the graph fanout.
beamWidth - the size of the beam search to use when finding nearest neighbors.
seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
- Throws:
IOException
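The repeatability that the seed parameter provides can be illustrated with java.util.SplittableRandom, the type of the random field listed above. The following is a minimal, self-contained sketch and not this class's code: SeedRepeatabilityDemo and draw are hypothetical names, and the draws merely stand in for whatever random decisions the builder makes during construction.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

/** Hypothetical demo class; not part of Lucene. */
final class SeedRepeatabilityDemo {

  /** Draws count pseudo-random ints in [0, bound) from a generator created with the given seed. */
  static int[] draw(long seed, int count, int bound) {
    SplittableRandom random = new SplittableRandom(seed);
    int[] out = new int[count];
    for (int i = 0; i < count; i++) {
      out[i] = random.nextInt(bound);
    }
    return out;
  }

  public static void main(String[] args) {
    int[] first = draw(42L, 8, 100);
    int[] second = draw(42L, 8, 100);
    // Two generators created with the same seed produce the same sequence,
    // so any construction driven only by these draws repeats exactly.
    System.out.println(Arrays.equals(first, second)); // prints true
  }
}
```

Passing a fixed seed is therefore mainly useful in tests, where two builds over the same vectors should yield the same graph.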
-
-
Method Details
-
build
Reads all the vectors from two copies of a RandomAccessVectorValues. Providing two copies enables efficient retrieval without extra data copying, while avoiding collision of the returned values.
- Parameters:
vectors - the vectors for which to build a nearest neighbors graph. Must be an independent accessor for the vectors.
- Throws:
IOException
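Why two independent accessors avoid "collision of the returned values" can be sketched as follows. This assumes an accessor that reuses one internal buffer for every vector it returns, which is a common way to avoid copying; ReusingAccessor and its methods are hypothetical and are not the RandomAccessVectorValues API.

```java
/** Hypothetical accessor that returns the same internal buffer on every call; not Lucene's API. */
final class ReusingAccessor {
  private final float[][] data;
  private final float[] buffer;

  ReusingAccessor(float[][] data) {
    this.data = data;
    this.buffer = new float[data[0].length];
  }

  /** Copies vector ord into the shared buffer and returns that buffer. */
  float[] vectorValue(int ord) {
    System.arraycopy(data[ord], 0, buffer, 0, buffer.length);
    return buffer;
  }

  /** Returns an independent accessor over the same data, with its own buffer. */
  ReusingAccessor copy() {
    return new ReusingAccessor(data);
  }

  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[][] data = {{1f, 0f}, {0f, 1f}};
    ReusingAccessor one = new ReusingAccessor(data);

    // Single accessor: the second read overwrites the first, so both
    // references point at vector 1 and the comparison is wrong.
    float[] a = one.vectorValue(0);
    float[] b = one.vectorValue(1);
    System.out.println(dot(a, b)); // prints 1.0, although the true dot product is 0

    // Two accessors: each keeps its own buffer, so both vectors stay intact.
    ReusingAccessor two = one.copy();
    float[] c = one.vectorValue(0);
    float[] d = two.vectorValue(1);
    System.out.println(dot(c, d)); // prints 0.0
  }
}
```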
-
setInfoStream
Set info-stream to output debugging information
-
addGraphNode
Inserts a document with the given vector value into the graph.
- Throws:
IOException
-
addDiverseNeighbors
- Throws:
IOException
-
selectDiverse
private void selectDiverse(Lucene90NeighborArray neighbors, Lucene90NeighborArray candidates) throws IOException
- Throws:
IOException
-
popToScratch
-
diversityCheck
private boolean diversityCheck(float[] candidate, float score, Lucene90NeighborArray neighbors, RandomAccessVectorValues<float[]> vectorValues) throws IOException
- Parameters:
candidate - the vector of a new candidate neighbor of a node n
score - the score of the new candidate and node n, to be compared with scores of the candidate and n's neighbors
neighbors - the neighbors selected so far
vectorValues - source of values used for making comparisons between candidate and existing neighbors
- Returns:
whether the candidate is diverse given the existing neighbors
- Throws:
IOException
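The comparison described by the parameters above can be sketched in a self-contained form. This is an illustration under stated assumptions, not this class's implementation: DiversitySketch, similarity and isDiverse are hypothetical names, neighbors are passed as plain arrays rather than a Lucene90NeighborArray, the similarity 1 / (1 + squared Euclidean distance) is chosen only so that higher means closer, and treating ties as non-diverse is an assumption.

```java
import java.util.List;

/** Hypothetical sketch of a diversity test; not Lucene's code. */
final class DiversitySketch {

  /** Similarity where higher means closer: 1 / (1 + squared Euclidean distance). */
  static float similarity(float[] a, float[] b) {
    float squared = 0f;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      squared += diff * diff;
    }
    return 1f / (1f + squared);
  }

  /**
   * Returns true if the candidate is less similar to every already selected
   * neighbor than it is to node n, where score = similarity(candidate, n).
   */
  static boolean isDiverse(float[] candidate, float score, List<float[]> neighbors) {
    for (float[] neighbor : neighbors) {
      if (similarity(candidate, neighbor) >= score) {
        return false; // the candidate is at least as close to this neighbor as to n
      }
    }
    return true;
  }

  public static void main(String[] args) {
    float[] n = {0f, 0f};
    List<float[]> selected = List.of(new float[] {1f, 0f});

    float[] near = {1.2f, 0.1f}; // lies right next to the selected neighbor
    float[] far = {-1f, 0f};     // lies on the opposite side of n

    System.out.println(isDiverse(near, similarity(near, n), selected)); // prints false
    System.out.println(isDiverse(far, similarity(far, n), selected));   // prints true
  }
}
```

A check of this kind keeps a node's links spread over different directions instead of clustering them around one close neighbor.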
-
diversityUpdate
- Throws:
IOException
-
findNonDiverse
- Throws:
IOException
-