Creates a new RaptorTree.
Configuration including LLM caller, embedding manager, vector store, and clustering parameters.
const raptor = new RaptorTree({
llmCaller: myLlm,
embeddingManager: myEmbeddings,
vectorStore: myStore,
clusterSize: 8,
maxDepth: 4,
});
Builds the RAPTOR tree from a set of leaf chunks.
Pipeline for each layer:
1. Cluster the current layer's nodes.
2. Summarize each cluster with the LLM.
3. Embed and store the summaries as the next layer's nodes.

Stops when:
- maxDepth is reached, or
- fewer than minChunksForLayer summaries were produced.

Leaf chunks to build the tree from.
Statistics about the constructed tree.
If embedding or storage fails critically.
const chunks = documents.map((doc, i) => ({
id: `chunk-${i}`,
text: doc.content,
metadata: { source: doc.source },
}));
const stats = await raptor.build(chunks);
console.log(`Tree has ${stats.totalLayers} layers`);
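The per-layer loop and its stop conditions can be sketched with toy stand-ins. Here, fixed-size grouping and string-joining replace the real embedding-based clustering and LLM summarization, and the helper names (`clusterNodes`, `buildLayers`) are illustrative, not part of the library's API:

```typescript
type TreeNode = { id: string; text: string; layer: number };

// Toy clustering: fixed-size groups in order. The real build groups
// nodes by embedding similarity using the configured cluster parameters.
function clusterNodes(nodes: TreeNode[], clusterSize: number): TreeNode[][] {
  const groups: TreeNode[][] = [];
  for (let i = 0; i < nodes.length; i += clusterSize) {
    groups.push(nodes.slice(i, i + clusterSize));
  }
  return groups;
}

// Builds summary layers until maxDepth is hit or a layer produces
// fewer than minChunksForLayer summaries.
function buildLayers(
  leaves: TreeNode[],
  clusterSize: number,
  maxDepth: number,
  minChunksForLayer: number,
): TreeNode[] {
  const all = [...leaves];
  let current = leaves;
  for (let layer = 1; layer <= maxDepth; layer++) {
    const summaries = clusterNodes(current, clusterSize).map((group, i) => ({
      id: `L${layer}-${i}`,
      // The real pipeline calls the LLM here; joining text is a placeholder.
      text: group.map((n) => n.text).join(' '),
      layer,
    }));
    all.push(...summaries);
    if (summaries.length < minChunksForLayer) break; // layer has collapsed
    current = summaries;
  }
  return all;
}
```

With 8 leaves and clusterSize 2, each layer halves the node count (4, then 2, then 1), and the build stops once a layer falls below minChunksForLayer.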
Searches ALL layers of the RAPTOR tree simultaneously.
This is the key advantage of RAPTOR: a detail query will match leaf chunks, while a thematic query will match higher-layer summaries. Both types of results are returned together, sorted by relevance.
The search query.
Optional topK: number = 10. Maximum number of results across all layers.
Results from all layers, sorted by score.
If embedding or vector search fails.
const results = await raptor.search('authentication architecture', 10);
// May return:
// - Layer 0 chunks about specific auth implementations
// - Layer 1 summaries about auth patterns
// - Layer 2 high-level summary about security architecture
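Because nodes from every layer live in the same vector index, a single nearest-neighbor query scores leaves and summaries together. A minimal sketch of that unified scoring, assuming cosine similarity over layer-tagged vectors (the names below are illustrative, not the library's API):

```typescript
type StoredNode = { id: string; layer: number; vector: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// One query over one index holding nodes from every layer: leaf chunks
// and higher-layer summaries compete on the same relevance scale.
function searchAllLayers(index: StoredNode[], query: number[], topK: number) {
  return index
    .map((n) => ({ id: n.id, layer: n.layer, score: cosine(n.vector, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A query vector close to both a leaf and a layer-1 summary returns both in one ranked list, which is the mixed-layer behavior described above.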
Returns statistics about the last tree build.
Tree statistics including layer counts, node counts, cluster counts, and build time.
const stats = raptor.getStats();
console.log(`Layers: ${stats.totalLayers}, Nodes: ${stats.totalNodes}`);
RAPTOR — Recursive Abstractive Processing for Tree-Organized Retrieval.
Builds a hierarchical summary tree over document chunks, enabling retrieval at multiple levels of abstraction. Leaf nodes contain original chunks while higher layers contain progressively more abstract summaries.
Example: Building and searching a RAPTOR tree
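A self-contained sketch of that flow, using a toy in-memory stand-in so the example runs on its own. The real RaptorTree delegates to the configured LLM caller, embedding manager, and vector store, and its methods are async; everything below (the `ToyRaptorTree` class, its word-overlap scoring, string-join summaries) is an illustrative placeholder:

```typescript
type Chunk = { id: string; text: string; metadata?: Record<string, string> };
type TreeNode = Chunk & { layer: number };

// Toy stand-in for RaptorTree: synchronous, no LLM, no embeddings.
class ToyRaptorTree {
  private nodes: TreeNode[] = [];
  constructor(private opts: { clusterSize: number; maxDepth: number }) {}

  build(chunks: Chunk[]) {
    this.nodes = chunks.map((c) => ({ ...c, layer: 0 }));
    let current = this.nodes.slice();
    let totalLayers = 1;
    for (let layer = 1; layer <= this.opts.maxDepth && current.length > 1; layer++) {
      const next: TreeNode[] = [];
      for (let i = 0; i < current.length; i += this.opts.clusterSize) {
        const group = current.slice(i, i + this.opts.clusterSize);
        // A real build would summarize the group with the LLM.
        next.push({
          id: `L${layer}-${next.length}`,
          text: group.map((n) => n.text).join(' '),
          layer,
        });
      }
      this.nodes.push(...next);
      current = next;
      totalLayers++;
    }
    return { totalLayers, totalNodes: this.nodes.length };
  }

  search(query: string, topK = 10) {
    // A real search embeds the query; word overlap is a placeholder score.
    const words = query.toLowerCase().split(/\s+/);
    return this.nodes
      .map((n) => ({
        ...n,
        score: words.filter((w) => n.text.toLowerCase().includes(w)).length,
      }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

// Usage mirrors the documented API: build from leaf chunks, then search.
const documents = [
  { content: 'JWT tokens sign each request', source: 'auth.md' },
  { content: 'OAuth flows delegate authentication', source: 'auth.md' },
  { content: 'Indexes speed up query plans', source: 'db.md' },
  { content: 'Sharding splits data across nodes', source: 'db.md' },
];
const chunks = documents.map((doc, i) => ({
  id: `chunk-${i}`,
  text: doc.content,
  metadata: { source: doc.source },
}));
const raptor = new ToyRaptorTree({ clusterSize: 2, maxDepth: 4 });
const stats = raptor.build(chunks);
const results = raptor.search('authentication architecture', 3);
```

With 4 chunks and clusterSize 2, the toy tree has 3 layers (4 leaves, 2 summaries, 1 root), and the search hits come back from multiple layers at once.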