Stage 6: AI Project Practice and Production Deployment

A comprehensive machine learning engineering course that teaches how to integrate machine learning with software engineering, covering the entire process from experimentation to production deployment.

MLOps · Machine Learning · Production ML · GitHub · Text · Free · English

Made With ML Project Details

Project Overview

Made With ML is an open-source project created by Goku Mohandas, focused on teaching how to combine machine learning with software engineering to design, develop, deploy, and iterate production-grade machine learning applications. The project has become one of the top machine learning repositories on GitHub, with over 40,000 developers following it.

Project Goals and Features

Core Philosophy

The course iteratively builds reliable production systems, progressing from the experimentation phase (design + development) to the production phase (deployment + iteration).

Key Features

  1. 💡 First Principles: Establish a first-principles understanding of each machine learning concept before diving into the code.
  2. 💻 Best Practices: Implement software engineering best practices when developing and deploying machine learning models.
  3. 📈 Scaling: Easily scale machine learning workloads (data, training, tuning, serving) in Python without learning a completely new language.
  4. ⚙️ MLOps: Connect MLOps components (tracking, testing, serving, orchestration, etc.) to build end-to-end machine learning systems.
  5. 🚀 Development to Production: Learn how to move from development to production quickly and reliably without changing code or infrastructure management.
  6. 🐙 CI/CD: Learn how to create mature CI/CD workflows to continuously train and deploy better models in a modular way.

Target Audience

The project is aimed at various types of learners:

  • 👩‍💻 All Developers: Whether you are a software/infrastructure engineer or a data scientist, machine learning is increasingly becoming a core part of product development.
  • 👩‍🎓 University Graduates: Learn the practical skills needed in industry and bridge the gap between university coursework and industry expectations.
  • 👩‍💼 Product/Leadership: Build the technical foundation needed to create amazing and reliable products powered by machine learning.

Project Structure and Content

Code Structure

The core code of the project is refactored into the following Python scripts:

madewithml
├── config.py
├── data.py
├── evaluate.py
├── models.py
├── predict.py
├── serve.py
├── train.py
├── tune.py
└── utils.py

Main Workflow

1. Environment Setup

The project supports multiple deployment environments:

  • Local Environment: Use a personal laptop as the cluster (see the sketch after this list)
  • Anyscale Platform: Use Anyscale Workspace for cloud development
  • Other Platforms: Support AWS, GCP, Kubernetes, local deployment, etc.
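
For the local option, the "cluster" is simply the current machine. A minimal sketch (assuming Ray is installed locally) of starting and inspecting such a cluster:

import ray

# Start Ray on the current machine; resources (CPUs/GPUs) are auto-detected,
# or can be overridden with num_cpus / num_gpus.
ray.init()

# Show what the local "cluster" has available.
print(ray.cluster_resources())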

2. Data and Model Training

export EXPERIMENT_NAME="llm"
export DATASET_LOC="https://raw.githubusercontent.com/GokuMohandas/Made-With-ML/main/datasets/dataset.csv"
export TRAIN_LOOP_CONFIG='{"dropout_p": 0.5, "lr": 1e-4, "lr_factor": 0.8, "lr_patience": 3}'
python madewithml/train.py \
--experiment-name "$EXPERIMENT_NAME" \
--dataset-loc "$DATASET_LOC" \
--train-loop-config "$TRAIN_LOOP_CONFIG" \
--num-workers 1 \
--cpu-per-worker 3 \
--gpu-per-worker 1 \
--num-epochs 10 \
--batch-size 256 \
--results-fp results/training_results.json
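
The keys in TRAIN_LOOP_CONFIG correspond to standard training-loop components. As a hypothetical sketch of how such a config is typically consumed (the model here is a stand-in, not the project's architecture): dropout_p sets the dropout probability, lr the optimizer learning rate, and lr_factor/lr_patience a ReduceLROnPlateau-style scheduler.

import json, os
import torch

config = json.loads(os.environ["TRAIN_LOOP_CONFIG"])

# Stand-in model; the real project fine-tunes a transformer-based text classifier.
model = torch.nn.Sequential(
    torch.nn.Linear(768, 128),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=config["dropout_p"]),   # dropout_p
    torch.nn.Linear(128, 4),
)
optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])   # lr
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=config["lr_factor"], patience=config["lr_patience"]
)   # lr_factor / lr_patience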

3. Model Tuning

$INITIAL_PARAMS holds the starting hyperparameters for the search (analogous to $TRAIN_LOOP_CONFIG above) and must be exported before running:

python madewithml/tune.py \
--experiment-name "$EXPERIMENT_NAME" \
--dataset-loc "$DATASET_LOC" \
--initial-params "$INITIAL_PARAMS" \
--num-runs 2 \
--num-workers 1 \
--cpu-per-worker 3 \
--gpu-per-worker 1 \
--num-epochs 10 \
--batch-size 256 \
--results-fp results/tuning_results.json
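
Under the hood, tune.py runs a hyperparameter search with Ray Tune. A minimal, hypothetical sketch of that pattern (the trainable and search space here are illustrative, not the script's actual ones; reporting uses the Ray 2.x train.report API):

from ray import train, tune

def train_fn(config):
    # Stand-in trainable: the real script runs the full training loop
    # and reports the validation loss of each trial.
    val_loss = (config["lr"] - 1e-4) ** 2 + 0.01 * config["dropout_p"]
    train.report({"val_loss": val_loss})

tuner = tune.Tuner(
    train_fn,
    param_space={
        "lr": tune.loguniform(1e-5, 1e-3),
        "dropout_p": tune.uniform(0.3, 0.7),
    },
    tune_config=tune.TuneConfig(metric="val_loss", mode="min", num_samples=2),
)
results = tuner.fit()
print(results.get_best_result().config)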

4. Model Evaluation

Evaluation selects the best run by lowest validation loss and scores it on a held-out split ($HOLDOUT_LOC should point to that holdout dataset):

export RUN_ID=$(python madewithml/predict.py get-best-run-id --experiment-name $EXPERIMENT_NAME --metric val_loss --mode ASC)
python madewithml/evaluate.py \
--run-id $RUN_ID \
--dataset-loc $HOLDOUT_LOC \
--results-fp results/evaluation_results.json
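
The get-best-run-id step simply asks MLflow for the run with the lowest val_loss. A roughly equivalent, hypothetical query with the MLflow API (the tracking URI below is a placeholder; it should match the registry used during training):

import mlflow

mlflow.set_tracking_uri("file:///tmp/mlflow")  # placeholder; use the real registry path
runs = mlflow.search_runs(
    experiment_names=["llm"],
    order_by=["metrics.val_loss ASC"],  # ASC: lower validation loss is better
    max_results=1,
)
print(runs.iloc[0].run_id)  # best run id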

5. Model Prediction

python madewithml/predict.py predict \
--run-id $RUN_ID \
--title "Transfer learning with transformers" \
--description "Using transformers for transfer learning on text classification tasks."

6. Model Serving

python madewithml/serve.py --run_id $RUN_ID
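
Once the service is up, predictions can be requested over HTTP. A hypothetical client call (host, port, and route are assumptions; adjust them to the running Ray Serve/FastAPI application):

import json
import requests

payload = {
    "title": "Transfer learning with transformers",
    "description": "Using transformers for transfer learning on text classification tasks.",
}
# Assumed local endpoint; the actual route depends on the serve application.
response = requests.post("http://127.0.0.1:8000/predict", json=payload)
print(json.dumps(response.json(), indent=2))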

Experiment Tracking

The project uses MLflow for experiment tracking and model management:

export MODEL_REGISTRY=$(python -c "from madewithml import config; print(config.MODEL_REGISTRY)")
mlflow server -h 0.0.0.0 -p 8080 --backend-store-uri $MODEL_REGISTRY
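
With the tracking server running, each training run logs its parameters, metrics, and artifacts to the registry. A minimal sketch of the MLflow calls involved (URI and values are illustrative):

import mlflow

mlflow.set_tracking_uri("http://0.0.0.0:8080")  # the server started above
mlflow.set_experiment("llm")

with mlflow.start_run():
    mlflow.log_params({"dropout_p": 0.5, "lr": 1e-4})
    mlflow.log_metrics({"val_loss": 0.42}, step=1)  # placeholder metric value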

Testing Framework

The project includes a comprehensive test suite:

# Code testing
python3 -m pytest tests/code --verbose --disable-warnings

# Data testing
pytest --dataset-loc=$DATASET_LOC tests/data --verbose --disable-warnings

# Model testing
pytest --run-id=$RUN_ID tests/model --verbose --disable-warnings

# Coverage testing
python3 -m pytest tests/code --cov madewithml --cov-report html --disable-warnings
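
The tests/code suite exercises the library's Python utilities. A hypothetical example of the kind of pytest test it might contain (the function under test is illustrative, not from the repo):

# tests/code/test_example.py (illustrative)
import pytest

def clean_text(text: str) -> str:
    # Stand-in for a preprocessing utility such as those in madewithml/data.py.
    return " ".join(text.lower().split())

@pytest.mark.parametrize(
    "raw, expected",
    [("  Hello   World ", "hello world"), ("MLOps", "mlops")],
)
def test_clean_text(raw, expected):
    assert clean_text(raw) == expected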

Production Deployment

Anyscale Deployment

The project provides a complete Anyscale deployment solution:

  1. Cluster Environment Configuration:

export CLUSTER_ENV_NAME="madewithml-cluster-env"
anyscale cluster-env build deploy/cluster_env.yaml --name $CLUSTER_ENV_NAME

  2. Compute Configuration:

export CLUSTER_COMPUTE_NAME="madewithml-cluster-compute-g5.4xlarge"
anyscale cluster-compute create deploy/cluster_compute.yaml --name $CLUSTER_COMPUTE_NAME

  3. Job Submission:

anyscale job submit deploy/jobs/workloads.yaml

  4. Service Deployment:

anyscale service rollout -f deploy/services/serve_model.yaml

CI/CD Process

The project integrates GitHub Actions to implement automated deployment:

  1. Workflow Trigger: The workloads workflow runs when a pull request is created
  2. Model Training and Evaluation: Training and evaluation are executed automatically
  3. Result Feedback: Training and evaluation results are surfaced directly in the PR
  4. Automatic Deployment: Merging into the main branch automatically deploys to the production environment

Core Learning Points

Tech Stack

  • Python: Core programming language
  • Ray: Distributed computing framework
  • MLflow: Experiment tracking and model management
  • Transformers: Deep learning models
  • FastAPI: API service framework
  • pytest: Testing framework
  • GitHub Actions: CI/CD platform

Machine Learning Engineering Best Practices

  1. Code Organization: Modular project structure
  2. Experiment Management: Systematic experiment tracking
  3. Version Control: Version management of code and models
  4. Testing Strategy: Comprehensive test coverage
  5. Deployment Automation: CI/CD process integration
  6. Monitoring and Maintenance: Continuous monitoring of the production environment

Project Value

Machine learning is not a separate industry; it is a powerful way of thinking about data that is not reserved for any one type of person. This project provides a complete learning path, from basic concepts to production deployment, helping learners master the full range of modern machine learning engineering skills.

Continuous Improvement

The project emphasizes continuous improvement. With CI/CD workflows in place, you can focus on continuously improving the model and easily extend to scheduled runs (cron), data pipelines, drift detection and monitoring, online evaluation, and more.

This project provides machine learning practitioners with a comprehensive and practical learning resource, covering the entire process from conceptual understanding to production deployment, and is an excellent resource for learning modern MLOps.