Other Functions in PyCaret
The pull function returns the last printed scoring grid. Use it after any training function to store the scoring grid as a pandas.DataFrame.
# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')
# compare models
best_model = compare_models()
# get the scoring grid
results = pull()
type(results)
>>> pandas.core.frame.DataFrame
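Because pull returns a plain pandas.DataFrame, you can sort, filter, or export the scoring grid like any other DataFrame. The grid below is a made-up stand-in (the column names mirror a typical classification scoring grid, not output from a real run):

```python
import pandas as pd

# Illustrative stand-in for the DataFrame returned by pull()
results = pd.DataFrame({
    "Model": ["Logistic Regression", "Random Forest", "KNN"],
    "Accuracy": [0.77, 0.79, 0.71],
    "AUC": [0.83, 0.85, 0.74],
})

# Sort by AUC, best model first, as with any pandas.DataFrame
top = results.sort_values("AUC", ascending=False).reset_index(drop=True)
print(top.loc[0, "Model"])  # Random Forest
```

From here the grid can also be persisted with standard pandas I/O, e.g. `top.to_csv('scoring_grid.csv')`.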
The models function returns a table containing all the models available in the model library of the imported module.
# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')
# check model library
models()
If you want to see a little more information than this, you can pass internal=True.
# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')
# check model library
models(internal = True)
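The table returned by models is itself a DataFrame, so it can be filtered with ordinary pandas operations. The mock table below is an assumption for illustration (in PyCaret the real table is returned by models(), indexed by estimator ID, with a Turbo flag marking the faster estimators):

```python
import pandas as pd

# Mock of the model-library table; column names are illustrative
lib = pd.DataFrame(
    {
        "Name": ["Logistic Regression", "K Neighbors Classifier", "SVM - Radial Kernel"],
        "Turbo": [True, True, False],
    },
    index=["lr", "knn", "rbfsvm"],
)

# Keep only the estimators flagged as Turbo
fast = lib[lib["Turbo"]]
print(list(fast.index))  # ['lr', 'knn']
```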
The get_config function retrieves the global variables created when the setup function is initialized.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')
# get X_train
get_config('X_train')
Variables accessible through the get_config function:
- X: Transformed dataset (X)
- y: Transformed dataset (y)
- X_train: Transformed train dataset (X)
- X_test: Transformed test/holdout dataset (X)
- y_train: Transformed train dataset (y)
- y_test: Transformed test/holdout dataset (y)
- seed: random state set through session_id
- prep_pipe: Transformation pipeline
- fold_shuffle_param: shuffle parameter used in Kfolds
- n_jobs_param: n_jobs parameter used in model training
- html_param: html_param configured through setup
- create_model_container: results grid storage container
- master_model_container: model storage container
- display_container: results display container
- exp_name_log: Name of experiment
- logging_param: log_experiment param
- log_plots_param: log_plots param
- USI: Unique session ID parameter
- fix_imbalance_param: fix_imbalance param
- fix_imbalance_method_param: fix_imbalance_method param
- data_before_preprocess: data before preprocessing
- target_param: name of target variable
- gpu_param: use_gpu param configured through setup
- fold_generator: CV splitter configured in fold_strategy
- fold_param: fold params defined in the setup
- fold_groups_param: fold groups defined in the setup
- stratify_param: stratify parameter defined in the setup
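Conceptually, get_config and set_config behave like a guarded key-value store over the session's global variables. The sketch below illustrates that pattern in plain Python; it is not PyCaret's actual implementation:

```python
# Minimal sketch of the get_config / set_config pattern:
# a key-value store that rejects unknown variable names.
_globals = {"seed": 123, "n_jobs_param": -1}

def get_config(name):
    # Look up a session variable, failing loudly on unknown names
    if name not in _globals:
        raise ValueError(f"Variable {name} not found.")
    return _globals[name]

def set_config(name, value):
    # Overwrite an existing session variable
    if name not in _globals:
        raise ValueError(f"Variable {name} not found.")
    _globals[name] = value

set_config("seed", 999)
print(get_config("seed"))  # 999
```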
The set_config function resets the global variables.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)
# reset environment seed
set_config('seed', 999)
The get_metrics function returns a table of all the metrics available in the metric container. All of these metrics are used during cross-validation.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)
# get metrics
get_metrics()
The add_metric function adds a custom metric to the metric container.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)
# add metric
from sklearn.metrics import log_loss
add_metric('logloss', 'Log Loss', log_loss, greater_is_better = False)
Now if you check metric container:
get_metrics()
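Any callable with the scikit-learn score-function signature (y_true, y_pred) can be registered this way, not just built-in sklearn metrics. Below is a hypothetical custom metric (specificity, i.e. the true-negative rate) written in plain Python; the name and the metric choice are assumptions for illustration:

```python
def specificity(y_true, y_pred):
    """True-negative rate: TN / (TN + FP). Hypothetical custom metric."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp) if (tn + fp) else 0.0

# Sanity check on toy labels: 2 TN, 1 FP -> 2/3
print(specificity([0, 0, 1, 1, 0], [0, 1, 1, 1, 0]))
```

After verifying it behaves as expected, it could be registered with `add_metric('spec', 'Specificity', specificity)`.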
The remove_metric function removes a metric from the metric container.
# remove metric
remove_metric('logloss')
No Output. Let's check the metric container again.
get_metrics()
The automl function returns the best model out of all models trained in the current setup, based on the optimize parameter. The metrics evaluated can be accessed using the get_metrics function.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable')
# compare models
top5 = compare_models(n_select = 5)
# tune models
tuned_top5 = [tune_model(i) for i in top5]
# ensemble models
bagged_top5 = [ensemble_model(i) for i in tuned_top5]
# blend models
blender = blend_models(estimator_list = top5)
# stack models
stacker = stack_models(estimator_list = top5)
# automl
best = automl(optimize = 'Recall')
print(best)
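Conceptually, automl scans the cross-validated scores of every model trained in the session and returns the one with the best value of the optimize metric. The sketch below illustrates that selection logic with made-up scores; it is not PyCaret's implementation:

```python
# Made-up CV scores for three trained models, for illustration only
cv_results = {
    "lr": {"Accuracy": 0.77, "Recall": 0.55},
    "rf": {"Accuracy": 0.79, "Recall": 0.61},
    "nb": {"Accuracy": 0.74, "Recall": 0.66},
}

def pick_best(results, optimize="Recall", greater_is_better=True):
    # Choose the model whose score on `optimize` is best
    chooser = max if greater_is_better else min
    return chooser(results, key=lambda m: results[m][optimize])

print(pick_best(cv_results, optimize="Recall"))    # nb
print(pick_best(cv_results, optimize="Accuracy"))  # rf
```

Note that greater_is_better matters here just as it does in add_metric: a loss-type metric such as log loss would be minimized instead.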
The get_logs function returns a table of experiment logs. It only works when log_experiment = True is passed when initializing the setup function.
# load dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', log_experiment = True, experiment_name = 'diabetes1')
# compare models
top5 = compare_models()
# check ML logs
get_logs()
The get_system_logs function reads and prints the logs.log file from the current working directory.
# loading dataset
from pycaret.datasets import get_data
data = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data, target = 'Class variable', session_id = 123)
# check system logs
from pycaret.utils import get_system_logs
get_system_logs()