Statistics

<Model>Statistics gives access to the statistics of a fitted sklearn.<Model>. It is a convenience function that collects a small, fixed set of statistics, such as the model's score.

Consider the same scripts shown in the model section, which fit and predict with the diabetes dataset. You can use <Model>Statistics to obtain the model's statistics. This convenience functionality works out of the box, but we recommend customizing how the model is interpreted so that you extract the statistics that matter to you.
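
As a minimal sketch of that customization, the statistics dictionary can also be assembled by hand. The helper name custom_statistics and the n_samples entry are hypothetical and not part of nextmv_sklearn. The layout mirrors the JSON printed by the examples on this page, and the stub object only stands in for a fitted scikit-learn estimator so that the sketch runs on its own. It assumes start_time was captured with time.time().

```python
import json
import time
from types import SimpleNamespace


def custom_statistics(fit, X, y, start_time):
    """Build a statistics dictionary by hand (hypothetical helper)."""
    custom = {"score": fit.score(X, y)}

    # Tree-based estimators expose feature_importances_; others do not.
    importances = getattr(fit, "feature_importances_", None)
    if importances is not None:
        custom["feature_importances_"] = [float(v) for v in importances]

    # Any extra statistic you care about can be added here.
    custom["n_samples"] = len(y)

    return {
        "run": {"duration": time.time() - start_time},
        "result": {"custom": custom},
        "series_data": {},
        "schema": "v1",
    }


# Stub standing in for a fitted estimator. A real script would pass the
# object returned by the model's fit call instead.
stub_fit = SimpleNamespace(
    score=lambda X, y: 0.5,
    feature_importances_=[0.75, 0.25],
)
X = [[0.0, 1.0], [1.0, 0.0]]
y = [1.0, 2.0]

start_time = time.time()
statistics = custom_statistics(stub_fit, X, y, start_time)
print(json.dumps(statistics, indent=2))
```

Because the helper only relies on score and, when present, feature_importances_, it should work with any fitted regressor that follows the scikit-learn estimator interface.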

Dummy

Reference

Find the reference for the dummy.statistics module here.

import json

from nextmv_sklearn import dummy

# Model code here. It is assumed to define fit, X, y, and start_time.

statistics = dummy.DummyRegressorStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py
{
  "run": {
    "duration": 0.0009911060333251953
  },
  "result": {
    "custom": {
      "score": 0.0
    }
  },
  "series_data": {},
  "schema": "v1"
}
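
The score of 0.0 is what the coefficient of determination (R²) gives for a model that always predicts the mean of the targets it is scored on. The sketch below reproduces that arithmetic in plain Python. It assumes the dummy regressor uses scikit-learn's default mean strategy and is scored on the same data it was fitted on. The target values are illustrative only.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative target values.
y = [151.0, 75.0, 141.0, 206.0, 135.0]

# A mean-strategy dummy model predicts the same value for every sample.
mean_prediction = sum(y) / len(y)
predictions = [mean_prediction] * len(y)

score = r_squared(y, predictions)
print(score)
```

The residual and total sums of squares coincide for a constant mean prediction, so the ratio is 1 and the score is 0. This makes the dummy score a useful baseline for the other models.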

Ensemble

Reference

Find the reference for the ensemble.statistics module here.

import json

from nextmv_sklearn import ensemble

# Model code here.

statistics = ensemble.GradientBoostingRegressorStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py
{
  "run": {
    "duration": 0.06733989715576172
  },
  "result": {
    "custom": {
      "depth": 3,
      "feature_importances_": [
        0.04838302190078237,
        0.013922652165423995,
        0.261283146271802,
        0.10170455309660177,
        0.028599216988454114,
        0.04562385785680753,
        0.040315277921146336,
        0.01589286353541712,
        0.402315668430028,
        0.04195974183353675
      ],
      "score": 0.7990392018966865
    }
  },
  "series_data": {},
  "schema": "v1"
}

The same pattern applies to a random forest, using RandomForestRegressorStatistics:

import json

from nextmv_sklearn import ensemble

# Model code here.

statistics = ensemble.RandomForestRegressorStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py
{
  "run": {
    "duration": 0.15362906455993652
  },
  "result": {
    "custom": {
      "feature_importances_": [
        0.05804547555574568,
        0.011942445816542004,
        0.26046692216365497,
        0.09831628970586574,
        0.04414312280601346,
        0.05823550762545449,
        0.05163041383919266,
        0.024326122944217127,
        0.323262101101696,
        0.06963159844161786
      ],
      "score": 0.9180285142096314
    }
  },
  "series_data": {},
  "schema": "v1"
}
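
The feature_importances_ list is easier to read once each value is paired with a feature name. The sketch below does that for the RandomForestRegressorStatistics output above. It assumes the columns of X follow the default order of scikit-learn's diabetes dataset (age, sex, bmi, bp, s1 to s6); the values are copied from that output.

```python
# Assumed column order of the diabetes dataset.
feature_names = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]

# Values copied from the "feature_importances_" entry printed above.
importances = [
    0.05804547555574568,
    0.011942445816542004,
    0.26046692216365497,
    0.09831628970586574,
    0.04414312280601346,
    0.05823550762545449,
    0.05163041383919266,
    0.024326122944217127,
    0.323262101101696,
    0.06963159844161786,
]

# Sort features from most to least important.
ranked = sorted(
    zip(feature_names, importances),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, value in ranked:
    print(f"{name:>4}  {value:.3f}")

# Importances are normalized, so they should add up to roughly 1.
total = sum(importances)
print(f"total {total:.3f}")
```

Under that column-order assumption, s5 and bmi carry the most weight, which also holds for the gradient boosting output.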

Linear model

Reference

Find the reference for the linear_model.statistics module here.

import json

from nextmv_sklearn import linear_model

# Model code here.

statistics = linear_model.LinearRegressionStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py
{
  "run": {
    "duration": 0.0011751651763916016
  },
  "result": {
    "custom": {
      "score": 0.5177484222203499
    }
  },
  "series_data": {},
  "schema": "v1"
}

Neural network

Reference

Find the reference for the neural_network.statistics module here.

import json

from nextmv_sklearn import neural_network

# Model code here.

statistics = neural_network.MLPRegressorStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py -max_iter 2500
{
  "run": {
    "duration": 0.9213109016418457
  },
  "result": {
    "custom": {
      "score": 0.5105739729721858
    }
  },
  "series_data": {},
  "schema": "v1"
}

Tree

Reference

Find the reference for the tree.statistics module here.

import json

from nextmv_sklearn import tree

# Model code here.

statistics = tree.DecisionTreeRegressorStatistics(fit, X, y, run_duration_start=start_time)
print(json.dumps(statistics.to_dict(), indent=2))

Run the script:

$ python main.py
{
  "run": {
    "duration": 0.0031609535217285156
  },
  "result": {
    "custom": {
      "depth": 20,
      "feature_importances_": [
        0.03648268541156411,
        0.008967396431459116,
        0.23191566128391577,
        0.082693316388564,
        0.07347510143597435,
        0.056688846730468985,
        0.07264156168631583,
        0.013709115567881148,
        0.3487799461424682,
        0.07464636892138855
      ],
      "n_leaves": 433,
      "score": 1.0
    }
  },
  "series_data": {},
  "schema": "v1"
}
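
A score of 1.0 together with 433 leaves suggests that the tree has nearly memorized its input. As a rough check, the sketch below computes the average number of samples per leaf from the output above. It assumes the statistics were computed on the same data used for fitting and that this data is the full 442-sample diabetes dataset. The helper name samples_per_leaf is hypothetical.

```python
# Values copied from the "custom" entry printed above.
custom = {"depth": 20, "n_leaves": 433, "score": 1.0}

# Assumed number of samples: the full diabetes dataset.
N_SAMPLES = 442


def samples_per_leaf(custom_stats, n_samples):
    """Average number of training samples that end up in each leaf."""
    return n_samples / custom_stats["n_leaves"]


ratio = samples_per_leaf(custom, N_SAMPLES)
print(f"{ratio:.2f} samples per leaf")

# A perfect score with close to one sample per leaf points to overfitting.
likely_overfit = custom["score"] == 1.0 and ratio < 2
print("likely overfit:", likely_overfit)
```

With roughly one sample per leaf, the perfect score says little about how the tree generalizes. Scoring on held-out data would give a more realistic estimate.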