RFC-0002: Tree and Forest Representation#
Status: Implemented
Created: 2025-12-15
Updated: 2026-01-02
Scope: GBDT model structure
Summary#
Trees use Structure-of-Arrays (SoA) layout for cache-efficient traversal.
Tree<L> is immutable for inference; MutableTree<L> supports construction.
Forest<L> manages collections of trees with multi-output support.
Why SoA?#
Tree traversal accesses node fields in sequence: is_leaf → split_index → threshold → children. SoA keeps each field contiguous:
Layout |
Cache Behavior |
|---|---|
Array-of-Structs |
Load unused fields (waste bandwidth) |
Struct-of-Arrays |
Sequential access to needed arrays |
Benchmark: SoA provides ~15% speedup in inference vs AoS layout due to better cache line utilization (fewer bytes loaded per node access).
Layers#
High Level#
Users interact with GBDTModel which wraps Forest:
let model = GBDTModel::train(&dataset, eval_set, config, seed)?;
let preds = model.predict(&test_data, n_threads);
Medium Level (Forest)#
pub struct Forest<L: LeafValue> {
trees: Vec<Tree<L>>,
tree_groups: Vec<u32>, // Maps tree → output group
n_groups: u32, // 1 = regression, K = multiclass
base_scores: Vec<f32>, // Per-group initial predictions
}
Multi-output: Each tree contributes to one output group. For K-class, trees round-robin: tree 0 → group 0, tree 1 → group 1, etc.
Tree Access#
impl<L: LeafValue> Forest<L> {
pub fn n_trees(&self) -> usize;
pub fn tree(&self, idx: usize) -> &Tree<L>;
pub fn trees(&self) -> impl Iterator<Item = &Tree<L>>;
pub fn tree_group(&self, idx: usize) -> u32;
}
Individual tree access is useful for debugging, explainability, and analysis.
Medium Level (Tree)#
pub struct Tree<L: LeafValue> {
// Core SoA arrays (indexed by NodeId)
split_indices: Box<[u32]>,
split_thresholds: Box<[f32]>,
left_children: Box<[u32]>,
right_children: Box<[u32]>,
default_left: Box<[bool]>,
is_leaf: Box<[bool]>,
leaf_values: Box<[L]>,
split_types: Box<[SplitType]>, // Numeric vs Categorical
// Categorical split storage
categories: CategoriesStorage,
// Optional for explainability
gains: Option<Box<[f32]>>,
covers: Option<Box<[f32]>>,
}
Why Box<[T]> not Vec<T>? Trees are immutable after construction.
Box<[T]> has smaller stack size (no capacity) and signals immutability.
Medium Level (MutableTree)#
pub struct MutableTree<L: LeafValue> {
// Same fields as Tree, but Vec for growth
split_indices: Vec<u32>,
// ...
}
impl<L: LeafValue> MutableTree<L> {
pub fn init_root(&mut self) -> NodeId;
pub fn apply_split(&mut self, node: NodeId, split: &SplitInfo) -> (NodeId, NodeId);
pub fn make_leaf(&mut self, node: NodeId, value: L);
pub fn freeze(self) -> Tree<L>;
}
Grower produces MutableTree, calls freeze() to get immutable Tree.
Low Level (TreeView Trait)#
pub trait TreeView {
type LeafValue: LeafValue;
fn n_nodes(&self) -> usize;
fn is_leaf(&self, node: NodeId) -> bool;
fn split_index(&self, node: NodeId) -> u32;
fn split_threshold(&self, node: NodeId) -> f32;
// ... traversal primitives
}
Both Tree and MutableTree implement TreeView, enabling generic
traversal code.
LeafValue Trait#
pub trait LeafValue: Clone + Default + Send + Sync {
fn accumulate(&mut self, other: &Self);
fn scale(&mut self, factor: f32);
}
pub struct ScalarLeaf(pub f32); // Standard case
Vector leaves exist but aren’t used—multi-output uses tree groups instead.
Categorical Storage#
pub struct CategoriesStorage {
categories: Box<[u32]>, // Packed bitsets
segments: Box<[(u32, u32)]>, // Per-node (start, len)
}
Bit set = category goes RIGHT. Memory-efficient: only allocate for categorical split nodes.
Files#
Path |
Contents |
|---|---|
|
|
|
|
|
|
|
|
|
|
Design Decisions#
DD-1: Separate Tree and MutableTree. Clear ownership: grower produces mutable, freezes to immutable. No runtime mutability checks.
DD-2: Tree groups over vector leaves. Each tree has scalar output, contributes to one group. Simpler, matches XGBoost/LightGBM, better parallelism.
DD-3: Categories go right. Convention matches XGBoost. Sorted partition algorithm (high g/h ratio → right) produces right-going sets naturally.
DD-4: Optional gains/covers. Only populated when:
Training with
store_node_stats: truein configLoading XGBoost/LightGBM models that include statistics
Not required for inference; needed for explainability (TreeSHAP, feature importance).
Testing Strategy#
Category |
Tests |
|---|---|
Tree structure |
Valid node indices, no orphans, proper leaf marking |
Traversal |
Reaches correct leaf for all samples |
Freeze correctness |
MutableTree → Tree preserves all data |
Categorical storage |
Bitset encode/decode roundtrip |
Multi-output |
Tree groups correctly assigned |