description: Other Functions in PyCaret

Others

pull

Returns the last printed scoring grid as a pandas.DataFrame. Call pull after any training function to store its scoring grid in a variable.

Example

# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')

# compare models
best_model = compare_models()

# get the scoring grid
results = pull()

Output from pull()

type(results)
>>> pandas.core.frame.DataFrame
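Because pull returns an ordinary DataFrame, the grid can be re-ranked with standard pandas methods. The helper below is a minimal sketch: `rank_grid` is a hypothetical name, and the `'AUC'` column in the commented usage is an assumption about the columns in the classification grid.

```python
def rank_grid(grid, metric, n = 3):
    """Return the n rows of a scoring grid with the highest value of `metric`."""
    # sort_values and head are standard pandas.DataFrame methods
    return grid.sort_values(by = metric, ascending = False).head(n)

# Hypothetical usage after compare_models(); commented out because it
# needs an active PyCaret experiment:
# results = pull()
# print(rank_grid(results, 'AUC'))
```

This only reorders the stored grid. It does not retrain or re-evaluate any model.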

models

Returns a table of all models available in the model library of the imported module.

Example

# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')

# check model library
models()

Output from models()

To see more information about each model, pass internal=True.

# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')

# check model library
models(internal = True)

Output from models(internal = True)

get_config

This function retrieves a global variable created by the setup function.

Example

# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')

# get X_train
get_config('X_train')

Output from get_config('X_train')

Variables accessible through the get_config function:

  • X: Transformed dataset (X)
  • y: Transformed dataset (y)
  • X_train: Transformed train dataset (X)
  • X_test: Transformed test/holdout dataset (X)
  • y_train: Transformed train dataset (y)
  • y_test: Transformed test/holdout dataset (y)
  • seed: random state set through session_id
  • prep_pipe: Transformation pipeline
  • fold_shuffle_param: shuffle parameter used in Kfolds
  • n_jobs_param: n_jobs parameter used in model training
  • html_param: html_param configured through setup
  • create_model_container: results grid storage container
  • master_model_container: model storage container
  • display_container: results display container
  • exp_name_log: Name of experiment
  • logging_param: log_experiment param
  • log_plots_param: log_plots param
  • USI: Unique session ID parameter
  • fix_imbalance_param: fix_imbalance param
  • fix_imbalance_method_param: fix_imbalance_method param
  • data_before_preprocess: data before preprocessing
  • target_param: name of target variable
  • gpu_param: use_gpu param configured through setup
  • fold_generator: CV splitter configured in fold_strategy
  • fold_param: fold params defined in the setup
  • fold_groups_param: fold groups defined in the setup
  • stratify_param: stratify parameter defined in the setup

set_config

This function updates the value of a global variable created by the setup function.

Example

# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)

# reset environment seed
set_config('seed', 999) 
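If a variable should only be changed temporarily, one option is to wrap get_config and set_config in a context manager that restores the original value afterwards. This is a sketch under assumptions: `temporary_config` is a hypothetical helper, not part of PyCaret, and it takes both functions as arguments.

```python
from contextlib import contextmanager

@contextmanager
def temporary_config(get_config, set_config, name, value):
    """Set a setup variable for the duration of a with-block, then restore it."""
    original = get_config(name)
    set_config(name, value)
    try:
        yield
    finally:
        # Restore the previous value even if the block raised an exception
        set_config(name, original)

# Hypothetical usage after setup(); commented out because it needs an
# active PyCaret experiment:
# from pycaret.classification import get_config, set_config
# with temporary_config(get_config, set_config, 'seed', 999):
#     model = create_model('lr')
```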

get_metrics

Returns a table of all metrics available in the metric container. All of these metrics are used for cross-validation.

# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)

# get metrics
get_metrics()

Output from get_metrics()

add_metric

Adds a custom metric to the metric container.

# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)

# add metric
from sklearn.metrics import log_loss
add_metric('logloss', 'Log Loss', log_loss, greater_is_better = False)

Output from add_metric('logloss', 'Log Loss', log_loss, greater_is_better = False)

Now if you check metric container:

get_metrics()

Output from get_metrics() (after adding log loss metric)
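The score function does not have to come from scikit-learn. The sketch below defines a hand-written specificity (true negative rate) metric, assuming binary labels where 0 is the negative class and assuming add_metric calls the function as `score_func(y_true, y_pred)`, as it does for `log_loss` above. The name `specificity` and the registration call in the comments are illustrative.

```python
def specificity(y_true, y_pred):
    """True negative rate for binary labels, where 0 marks the negative class."""
    true_neg = 0
    false_pos = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0:
            if predicted == 0:
                true_neg += 1
            else:
                false_pos += 1
    negatives = true_neg + false_pos
    # Return 0.0 when there are no negative samples, to avoid dividing by zero
    return true_neg / negatives if negatives else 0.0

# Hypothetical registration after setup(); commented out because it needs
# an active PyCaret experiment:
# add_metric('specificity', 'Specificity', specificity, greater_is_better = True)
```

Higher specificity is better, so `greater_is_better` is True here, unlike the log loss example.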

remove_metric

Removes a metric from the metric container.

# remove metric
remove_metric('logloss')

No output. Check the metric container again:

get_metrics()

Output from get_metrics() (after removing log loss metric)

automl

This function returns the best model out of all models trained in the current setup, ranked by the metric passed in the optimize parameter. The available metrics can be listed with the get_metrics function.

Example

# load dataset 
from pycaret.datasets import get_data 
data = get_data('diabetes') 

# init setup 
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable') 

# compare models
top5 = compare_models(n_select = 5) 

# tune models
tuned_top5 = [tune_model(i) for i in top5]

# ensemble models
bagged_top5 = [ensemble_model(i) for i in tuned_top5]

# blend models
blender = blend_models(estimator_list = top5) 

# stack models
stacker = stack_models(estimator_list = top5) 

# automl 
best = automl(optimize = 'Recall')
print(best)

Output from print(best)

get_logs

Returns a table of experiment logs. Only works when log_experiment = True is passed to the setup function.

Example

# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', log_experiment = True, experiment_name = 'diabetes1')

# compare models
top5 = compare_models()

# check ML logs
get_logs()

Output from get_logs()

get_system_logs

Reads and prints the logs.log file from the current working directory.

Example

# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)

# check system logs
from pycaret.utils import get_system_logs
get_system_logs()
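Since logs.log is a plain text file, it can also be read directly when only the most recent entries are of interest. The following is a minimal standard-library sketch: `tail_log` is a hypothetical helper, not a PyCaret function, and it assumes the file is in the current working directory unless another path is given.

```python
from pathlib import Path

def tail_log(path = 'logs.log', n = 20):
    """Return the last n lines of a log file, or an empty list if it does not exist."""
    log_file = Path(path)
    if not log_file.exists() or n <= 0:
        return []
    # errors = 'replace' keeps the read from failing on undecodable bytes
    lines = log_file.read_text(encoding = 'utf-8', errors = 'replace').splitlines()
    return lines[-n:]

# Print the most recent log lines, if any
for line in tail_log():
    print(line)
```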