This is a cookiecutter template focused on AI, designed for model architecture development, dataset creation, pipeline development, and model deployment using several open and free state-of-the-art tools.
This project was developed with multi-model and multi-dataset studies and implementations in mind. It is designed to use /mlflow/, /dvc/, /pre-commit/, /git/, /docker/ or /podman/, /jupyter lab/, /hydra/, /bentoml/, /pipenv/, and, optionally, a database such as duckdb or PostgreSQL, in local or cloud environments.
The project manages its own environment variables through a .env file integrated with the Hydra configuration, which provides two configuration branches: development and production. The project also uses /pdoc/ to generate its documentation in HTML format.
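As a rough illustration of how a .env file and a development/production switch can work together, the sketch below parses KEY=VALUE lines with the standard library and selects a profile. The file contents, the APP_ENV key, and both helper names are hypothetical; the template itself wires this through Hydra, and its exact configuration is not reproduced here.

```python
def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def select_profile(env: dict, default: str = "development") -> str:
    """Pick a profile from the hypothetical APP_ENV key."""
    profile = env.get("APP_ENV", default)
    if profile not in {"development", "production"}:
        raise ValueError(f"unknown profile: {profile!r}")
    return profile


# Hypothetical .env contents, used only for this illustration.
example = """
# which configuration branch to use
APP_ENV=production
MLFLOW_TRACKING_URI="http://localhost:5000"
"""

env = parse_dotenv(example)
profile = select_profile(env)
print(profile, env["MLFLOW_TRACKING_URI"])
```

In Hydra configuration files, environment variables are commonly read through OmegaConf's oc.env resolver; whether this template does it that way is an assumption.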
mindmap
  markdown[Root **folder_name**]
    markdown[**configs**]
      database
        duckdb
        mysql
        postgres
        sqlite
      mlflow
        development_mlflow
        production_mlflow
      optuna
        development_optuna
        production_optuna
      pipeline
        modelv1
      type
        development_type
        production_type
    markdown[Source **short_title**]
      markdown[**datasets**]
        datasetV1
      markdown[**deploy**]
        modelV1_deploy
        docker
      markdown[**models**]
        modelexample
      notebooks
      markdown[**train**]
        trainV1
      steps
    dataset
      final
      processed
      raw
    docs
This project takes care of configuring all its dependencies and tools. However, it requires that you have the Python package manager (pip) and cookiecutter installed.
sudo apt install python3-pip git && \
pip install --upgrade pip && \
pip install --upgrade cookiecutter
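To confirm that the prerequisites are in place before generating a project, a small check like the following can help. It is only a sketch: the function name is hypothetical, and it assumes the tools are exposed on PATH under the names pip, git, and cookiecutter (some systems only provide pip3).

```python
import shutil


def missing_tools(tools=("pip", "git", "cookiecutter")) -> list:
    """Return the names of the required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All prerequisites found.")
```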
Two settings are recommended:
1. Create the Python virtual environment inside the project folder.
2. Use main as the default Git branch.
export PIPENV_VENV_IN_PROJECT=1
git config --global init.defaultBranch main
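Both settings can be verified from Python. The helper names below are hypothetical, and the check assumes the value 1 for PIPENV_VENV_IN_PROJECT, matching the command above. Note that export only affects the current shell session unless it is added to a shell startup file.

```python
import os
import shutil
import subprocess
from typing import Optional


def venv_in_project(environ=os.environ) -> bool:
    """True when PIPENV_VENV_IN_PROJECT is set to 1, as recommended above."""
    return environ.get("PIPENV_VENV_IN_PROJECT") == "1"


def git_default_branch() -> Optional[str]:
    """Return Git's global init.defaultBranch, or None if unset or Git is absent."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "config", "--global", "--get", "init.defaultBranch"],
        capture_output=True,
        text=True,
        check=False,  # git exits non-zero when the key is unset
    )
    return result.stdout.strip() or None


print("venv in project:", venv_in_project())
print("default branch:", git_default_branch())
```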
To instantiate a project, run:
cookiecutter https://github.com/kascesar/artificial-inteligence-template.git
Then follow the prompts.
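The same generation step can also be driven from Python through cookiecutter's own API, which may be convenient in scripts. This is a minimal sketch: generate_project is a hypothetical wrapper, and only the template URL comes from this README.

```python
TEMPLATE_URL = "https://github.com/kascesar/artificial-inteligence-template.git"


def generate_project(output_dir: str = ".", no_input: bool = False) -> str:
    """Generate a project from the template and return the path it was created in."""
    # Imported lazily so this module still loads when cookiecutter is not installed.
    from cookiecutter.main import cookiecutter

    # no_input=False keeps the interactive prompts; True accepts the template defaults.
    return cookiecutter(TEMPLATE_URL, output_dir=output_dir, no_input=no_input)
```

Calling generate_project() needs network access and starts the same prompts as the command-line invocation.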
After cloning, make the hook setup script executable and run it:
chmod +x setup_hooks.sh && \
sh setup_hooks.sh
A: Anyone, whether a developer, data scientist, or machine learning engineer, who wants a clean, simple, scalable, and reproducible development environment.
A: Developers who use free and/or open-source MLOps and artificial intelligence tools such as mlflow, optuna, bentoml, docker, and tensorflow to study, develop, and deploy models to production.
A: You need at least a moderate understanding of Python, MLflow, DVC, Git, and Hydra.