Statistics
This section documents the statistics components of the tree module.
statistics
Scikit-learn tree module statistics interoperability for Nextmv.
This module provides functionality to integrate scikit-learn tree-based models with Nextmv statistics tracking.
FUNCTION | DESCRIPTION |
---|---|
DecisionTreeRegressorStatistics | Convert a DecisionTreeRegressor model to Nextmv statistics format. |
DecisionTreeRegressorStatistics
DecisionTreeRegressorStatistics(
    model: DecisionTreeRegressor,
    X: Iterable,
    y: Iterable,
    sample_weight: float = None,
    run_duration_start: Optional[float] = None,
) -> Statistics
Create a Nextmv statistics object from a scikit-learn DecisionTreeRegressor model.
You can import the DecisionTreeRegressorStatistics function directly from tree, as the import line in the Examples below shows.
Converts a trained scikit-learn DecisionTreeRegressor model into Nextmv statistics
format. The statistics include model depth, feature importances, number of leaves,
and model score. Additional custom metrics can be added by the user after this
function returns. The optional run_duration_start parameter can be used to track
the total runtime of the modeling process.
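The run_duration_start pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: it assumes the duration is the difference between the current time and the recorded start, and elapsed_seconds is a hypothetical helper introduced only for this example.

```python
import time


def elapsed_seconds(start: float) -> float:
    """Return wall-clock seconds elapsed since `start` (a time.time() value).

    Hypothetical helper for illustration; not part of nextmv_sklearn.
    """
    return time.time() - start


# Record the start before any modeling work begins, so that training
# and scoring are both included in the measured runtime.
run_duration_start = time.time()

# Stand-in for model training and scoring.
time.sleep(0.05)

duration = elapsed_seconds(run_duration_start)
print(f"run duration: {duration:.3f} s")
```

Recording the timestamp before training rather than just before the statistics call is what lets the tracked duration cover the whole modeling process.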
PARAMETER | DESCRIPTION |
---|---|
model | The trained scikit-learn DecisionTreeRegressor model. TYPE: DecisionTreeRegressor |
X | The input features used for scoring the model. TYPE: Iterable |
y | The target values used for scoring the model. TYPE: Iterable |
sample_weight | The sample weights used for scoring, by default None. TYPE: float |
run_duration_start | The timestamp when the model run started, typically from time.time(), by default None. TYPE: Optional[float] |
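To make the role of the sample weights in scoring concrete: for scikit-learn regressors, the score is the coefficient of determination (R²). The sketch below computes a weighted R² in plain Python. It is an illustration under stated assumptions, not the code that scikit-learn or Nextmv runs: weighted_r2 is a hypothetical helper, and it assumes one weight per sample, even though the signature above annotates sample_weight as float.

```python
from typing import Optional, Sequence


def weighted_r2(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    sample_weight: Optional[Sequence[float]] = None,
) -> float:
    """Coefficient of determination with optional per-sample weights.

    Hypothetical helper for illustration only.
    """
    # Without weights, every sample counts equally.
    weights = [1.0] * len(y_true) if sample_weight is None else list(sample_weight)
    total_weight = sum(weights)
    # Weighted mean of the targets.
    mean = sum(w * y for w, y in zip(weights, y_true)) / total_weight
    # Weighted residual and total sums of squares.
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    ss_tot = sum(w * (y - mean) ** 2 for w, y in zip(weights, y_true))
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]  # only the last prediction is off

unweighted = weighted_r2(y_true, y_pred)
# Down-weighting the badly predicted sample raises the score.
downweighted = weighted_r2(y_true, y_pred, sample_weight=[1.0, 1.0, 1.0, 0.1])
print(unweighted, downweighted)
```

The same model can therefore report different scores depending on the weights passed, so it is worth recording which weights were used alongside the resulting statistics.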
RETURNS | DESCRIPTION |
---|---|
Statistics | A Nextmv statistics object containing model performance metrics. |
Examples:
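>>> import time
>>>
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.tree import DecisionTreeRegressor
>>> from nextmv_sklearn.tree import DecisionTreeRegressorStatistics
>>>
>>> # Load example data (any regression dataset works) and hold out a test split
>>> X, y = load_diabetes(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>>
>>> # Record start time
>>> start_time = time.time()
>>>
>>> # Train model
>>> model = DecisionTreeRegressor(max_depth=5)
>>> model.fit(X_train, y_train)
>>>
>>> # Create statistics, scoring the model on the held-out data
>>> stats = DecisionTreeRegressorStatistics(
...     model, X_test, y_test, run_duration_start=start_time
... )
>>>
>>> # Add additional metrics
>>> custom_value = 42.0  # placeholder value for illustration
>>> stats.result.custom["my_custom_metric"] = custom_value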