Stage 6: AI Project Practice and Production Deployment
A comprehensive machine learning engineering course that teaches how to integrate machine learning with software engineering, covering the entire process from experimentation to production deployment.
Made With ML Project Details
Project Overview
Made With ML is an open-source project created by Goku Mohandas, focused on teaching how to combine machine learning with software engineering to design, develop, deploy, and iterate production-grade machine learning applications. The project has become one of the top machine learning repositories on GitHub, with over 40,000 developers following it.
Project Goals and Features
Core Philosophy
The course iteratively builds reliable production systems, progressing from the experimentation phase (design + development) to the production phase (deployment + iteration).
Key Features
- 💡 First Principles: Establish a first-principles understanding of each machine learning concept before diving into the code.
- 💻 Best Practices: Implement software engineering best practices when developing and deploying machine learning models.
- 📈 Scaling: Easily scale machine learning workloads (data, training, tuning, serving) in Python without learning a completely new language.
- ⚙️ MLOps: Connect MLOps components (tracking, testing, serving, orchestration, etc.) to build end-to-end machine learning systems.
- 🚀 Development to Production: Learn how to move from development to production quickly and reliably, without any changes to the code or to how infrastructure is managed.
- 🐙 CI/CD: Learn how to create mature CI/CD workflows to continuously train and deploy better models in a modular way.
Target Audience
The project is aimed at various types of learners:
- 👩‍💻 All Developers: Whether you are a software/infrastructure engineer or a data scientist, machine learning is increasingly a critical part of the products you develop.
- 👩‍🎓 University Graduates: Learn the practical skills the industry needs, bridging the gap between university courses and industry expectations.
- 👩‍💼 Product/Leadership: Those who want a technical foundation so that they can build amazing and reliable products powered by machine learning.
Project Structure and Content
Code Structure
The core code of the project is refactored into the following Python scripts:
madewithml
├── config.py
├── data.py
├── evaluate.py
├── models.py
├── predict.py
├── serve.py
├── train.py
├── tune.py
└── utils.py
Main Workflow
1. Environment Setup
The project supports multiple deployment environments:
- Local Environment: Use a personal laptop as a cluster
- Anyscale Platform: Use Anyscale Workspace for cloud development
- Other Platforms: Other cloud platforms (AWS, GCP), Kubernetes, on-premises clusters, etc.
2. Data and Model Training
export EXPERIMENT_NAME="llm"
export DATASET_LOC="https://raw.githubusercontent.com/GokuMohandas/Made-With-ML/main/datasets/dataset.csv"
export TRAIN_LOOP_CONFIG='{"dropout_p": 0.5, "lr": 1e-4, "lr_factor": 0.8, "lr_patience": 3}'
python madewithml/train.py \
--experiment-name "$EXPERIMENT_NAME" \
--dataset-loc "$DATASET_LOC" \
--train-loop-config "$TRAIN_LOOP_CONFIG" \
--num-workers 1 \
--cpu-per-worker 3 \
--gpu-per-worker 1 \
--num-epochs 10 \
--batch-size 256 \
--results-fp results/training_results.json
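The `lr_factor` and `lr_patience` entries in `TRAIN_LOOP_CONFIG` suggest a reduce-on-plateau learning-rate schedule. The following is a minimal sketch under that assumption, not the project's training loop: the function `plateau_schedule` and the validation losses are hypothetical, and the rule (shrink the rate once the loss has failed to improve for more than `patience` consecutive epochs) is a simplification of what a real scheduler does.

```python
import json

# Same JSON string as TRAIN_LOOP_CONFIG in the command above.
TRAIN_LOOP_CONFIG = '{"dropout_p": 0.5, "lr": 1e-4, "lr_factor": 0.8, "lr_patience": 3}'


def plateau_schedule(val_losses, lr, factor, patience):
    """Return the learning rate used in each epoch (simplified reduce-on-plateau).

    Assumption: the rate is multiplied by `factor` once the validation loss
    has failed to improve for more than `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    rates = []
    for loss in val_losses:
        rates.append(lr)  # rate in effect for this epoch
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
    return rates


config = json.loads(TRAIN_LOOP_CONFIG)
# Hypothetical validation losses: two improving epochs, then a plateau.
val_losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]
rates = plateau_schedule(
    val_losses, config["lr"], config["lr_factor"], config["lr_patience"]
)
print(rates)
```

With these losses the rate stays at its initial value through the plateau and is reduced only for the final epoch, which illustrates why a larger `lr_patience` makes the schedule more tolerant of noisy validation metrics.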
3. Model Tuning
export INITIAL_PARAMS="[{\"train_loop_config\": $TRAIN_LOOP_CONFIG}]"
python madewithml/tune.py \
--experiment-name "$EXPERIMENT_NAME" \
--dataset-loc "$DATASET_LOC" \
--initial-params "$INITIAL_PARAMS" \
--num-runs 2 \
--num-workers 1 \
--cpu-per-worker 3 \
--gpu-per-worker 1 \
--num-epochs 10 \
--batch-size 256 \
--results-fp results/tuning_results.json
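The tuning command reads its starting point from `$INITIAL_PARAMS`. A minimal sketch of building that value in Python, assuming it is a JSON list in which each entry wraps a training loop config under a `train_loop_config` key (the exact schema expected by `tune.py` should be checked against the script itself):

```python
import json

train_loop_config = {"dropout_p": 0.5, "lr": 1e-4, "lr_factor": 0.8, "lr_patience": 3}

# Assumption: the tuner expects a JSON list of starting points, each wrapping
# a training loop config under the "train_loop_config" key.
initial_params = json.dumps([{"train_loop_config": train_loop_config}])
print(initial_params)
```

The printed string can be exported as `INITIAL_PARAMS` before running the tuning command.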
4. Model Evaluation
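export RUN_ID=$(python madewithml/predict.py get-best-run-id --experiment-name $EXPERIMENT_NAME --metric val_loss --mode ASC)
export HOLDOUT_LOC="https://raw.githubusercontent.com/GokuMohandas/Made-With-ML/main/datasets/holdout.csv"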
python madewithml/evaluate.py \
--run-id $RUN_ID \
--dataset-loc $HOLDOUT_LOC \
--results-fp results/evaluation_results.json
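The `get-best-run-id` step selects a run by sorting on a metric, with `--mode ASC` meaning that lower values are better. The sketch below illustrates that selection logic on plain dictionaries. It is not the project's implementation, which presumably queries the experiment tracker, and the function and run records here are hypothetical.

```python
def get_best_run_id(runs, metric, mode):
    """Pick the run with the best metric value.

    mode="ASC" sorts ascending (lower is better, e.g. a loss);
    mode="DESC" sorts descending (higher is better, e.g. accuracy).
    """
    if mode not in ("ASC", "DESC"):
        raise ValueError(f"mode must be 'ASC' or 'DESC', got {mode!r}")
    ordered = sorted(
        runs, key=lambda run: run["metrics"][metric], reverse=(mode == "DESC")
    )
    return ordered[0]["run_id"]


# Hypothetical run records; real ones would come from the experiment tracker.
runs = [
    {"run_id": "run-a", "metrics": {"val_loss": 0.42}},
    {"run_id": "run-b", "metrics": {"val_loss": 0.31}},
    {"run_id": "run-c", "metrics": {"val_loss": 0.57}},
]
best = get_best_run_id(runs, metric="val_loss", mode="ASC")
print(best)
```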
5. Model Prediction
python madewithml/predict.py predict \
--run-id $RUN_ID \
--title "Transfer learning with transformers" \
--description "Using transformers for transfer learning on text classification tasks."
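Conceptually, a text classifier turns the title and description into one score per class, and the prediction is the class with the highest probability. The following sketch shows only that final decoding step, with hypothetical class names and scores. It does not reproduce the project's model or its output format.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, index_to_class):
    """Return the most likely class and the per-class probabilities."""
    probs = softmax(logits)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return {
        "prediction": index_to_class[best_index],
        "probabilities": dict(zip(index_to_class, probs)),
    }


# Hypothetical label set and model scores, for illustration only.
index_to_class = ["computer-vision", "mlops", "natural-language-processing", "other"]
result = decode([0.1, 0.3, 2.5, -1.0], index_to_class)
print(result["prediction"])
```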
6. Model Serving
python madewithml/serve.py --run_id $RUN_ID
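Once the service is running, a client can send it a project's title and description as JSON. The sketch below builds such a request with the standard library only. The address `127.0.0.1:8000` and the `/predict` route are assumptions and should be adjusted to the actual deployment. The request is constructed but not sent, so the code runs without a live service.

```python
import json
import urllib.request

# Assumptions: the service listens on 127.0.0.1:8000 and exposes a POST
# /predict route; adjust both to match your deployment.
PREDICT_URL = "http://127.0.0.1:8000/predict"


def build_predict_request(title, description, url=PREDICT_URL):
    """Build (but do not send) a JSON POST request for one project."""
    payload = json.dumps({"title": title, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    "Transfer learning with transformers",
    "Using transformers for transfer learning on text classification tasks.",
)
# To send it once the service is running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```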
Experiment Tracking
The project uses MLflow for experiment tracking and model management:
export MODEL_REGISTRY=$(python -c "from madewithml import config; print(config.MODEL_REGISTRY)")
mlflow server -h 0.0.0.0 -p 8080 --backend-store-uri $MODEL_REGISTRY
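The command above reads `MODEL_REGISTRY` from the project's `config` module. A minimal sketch of how such a setting could be defined, assuming a file-based registry under a local directory (the temporary directory used here is a placeholder, and the real `config.py` may derive the path differently):

```python
from pathlib import Path
import tempfile

# Assumption: the registry lives under a local directory. The real project may
# derive this path differently (for example from shared storage).
MODEL_REGISTRY = Path(tempfile.gettempdir()) / "mlflow"
MODEL_REGISTRY.mkdir(parents=True, exist_ok=True)

# A file-based URI that points at the registry directory.
MLFLOW_TRACKING_URI = MODEL_REGISTRY.absolute().as_uri()
print(MLFLOW_TRACKING_URI)
```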
Testing Framework
The project includes a comprehensive test suite:
# Code testing
python3 -m pytest tests/code --verbose --disable-warnings
# Data testing
pytest --dataset-loc=$DATASET_LOC tests/data --verbose --disable-warnings
# Model testing
pytest --run-id=$RUN_ID tests/model --verbose --disable-warnings
# Coverage testing
python3 -m pytest tests/code --cov madewithml --cov-report html --disable-warnings
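As an illustration of what a data test can check, the sketch below defines three pytest-style test functions over a small in-memory CSV. The column names, the sample rows, and the set of valid tags are assumptions made for the example, and the project's own data tests may rely on different tooling.

```python
import csv
import io

# Hypothetical sample; column names and tags are assumptions for illustration.
SAMPLE_CSV = """title,description,tag
Intro to CNNs,Image classification basics,computer-vision
Serving with Ray,Deploying models at scale,mlops
"""
VALID_TAGS = {"computer-vision", "mlops", "natural-language-processing", "other"}


def load_rows(text):
    """Parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def test_no_missing_values():
    """Every field in every row should be non-empty."""
    for row in load_rows(SAMPLE_CSV):
        assert all(value and value.strip() for value in row.values()), row


def test_tags_are_valid():
    """Every tag should belong to the expected label set."""
    for row in load_rows(SAMPLE_CSV):
        assert row["tag"] in VALID_TAGS, row["tag"]


def test_titles_are_unique():
    """No two rows should share a title."""
    titles = [row["title"] for row in load_rows(SAMPLE_CSV)]
    assert len(titles) == len(set(titles))
```

Saved in a file whose name starts with `test_`, these functions are collected and run by `pytest` automatically.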
Production Deployment
Anyscale Deployment
The project provides a complete Anyscale deployment solution:
- Cluster Environment Configuration:
export CLUSTER_ENV_NAME="madewithml-cluster-env"
anyscale cluster-env build deploy/cluster_env.yaml --name $CLUSTER_ENV_NAME
- Compute Configuration:
export CLUSTER_COMPUTE_NAME="madewithml-cluster-compute-g5.4xlarge"
anyscale cluster-compute create deploy/cluster_compute.yaml --name $CLUSTER_COMPUTE_NAME
- Job Submission:
anyscale job submit deploy/jobs/workloads.yaml
- Service Deployment:
anyscale service rollout -f deploy/services/serve_model.yaml
CI/CD Process
The project integrates GitHub Actions to implement automated deployment:
- Workflow Trigger: Opening a pull request triggers the workloads workflow
- Model Training and Evaluation: Automatically execute training and evaluation
- Result Feedback: Directly display training and evaluation results in the PR
- Automatic Deployment: Automatically deploy to the production environment after merging to the main branch
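The result feedback step can be pictured as turning a results dictionary into a Markdown comment. The sketch below shows one way to do that. The function name, the table layout, and the metric values are hypothetical, and the project's workflow may format its comments differently.

```python
def format_results_comment(title, metrics):
    """Render a metrics dictionary as a Markdown table for a PR comment."""
    lines = [f"### {title}", "", "| Metric | Value |", "| --- | --- |"]
    for name in sorted(metrics):
        lines.append(f"| {name} | {metrics[name]:.4f} |")
    return "\n".join(lines)


# Hypothetical metrics, for illustration only.
comment = format_results_comment(
    "Evaluation results", {"precision": 0.9, "recall": 0.85, "f1": 0.874}
)
print(comment)
```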
Core Learning Points
Tech Stack
- Python: Core programming language
- Ray: Distributed computing framework
- MLflow: Experiment tracking and model management
- Transformers: Deep learning models
- FastAPI: API service framework
- pytest: Testing framework
- GitHub Actions: CI/CD platform
Machine Learning Engineering Best Practices
- Code Organization: Modular project structure
- Experiment Management: Systematic experiment tracking
- Version Control: Version management of code and models
- Testing Strategy: Comprehensive test coverage
- Deployment Automation: CI/CD process integration
- Monitoring and Maintenance: Continuous monitoring of the production environment
Project Value
Machine learning is not a separate industry. It is a powerful way of thinking about data, and it is not reserved for any one type of person. This project provides a complete learning path, from basic concepts to production deployment, that helps learners build the full set of modern machine learning engineering skills.
Learning Resources
- Online Course: https://madewithml.com/
- Source Code: https://github.com/GokuMohandas/Made-With-ML
- Interactive Notebook: notebooks/madewithml.ipynb
- Live Bootcamp: Regularly held online bootcamps
Continuous Improvement
The project emphasizes continuous improvement. Once the CI/CD workflows are in place, you can focus on improving the models and extend the setup to scheduled runs (cron), data pipelines, drift detection, online evaluation, and more.
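To make drift detection concrete, the sketch below compares the distribution of predicted tags in a reference window against a current window using total variation distance. This is one simple technique chosen for illustration, not the project's monitoring method. The tag samples and the alert threshold are hypothetical.

```python
from collections import Counter


def distribution(labels):
    """Map each label to its relative frequency."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(reference, current):
    """Half the sum of absolute differences between two label distributions."""
    p, q = distribution(reference), distribution(current)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Hypothetical predicted tags from a reference window and a current window.
reference = ["mlops"] * 50 + ["other"] * 50
current = ["mlops"] * 80 + ["other"] * 20
DRIFT_THRESHOLD = 0.2  # assumed alert threshold

drift = total_variation_distance(reference, current)
drift_detected = drift > DRIFT_THRESHOLD
print(drift_detected)
```

The distance ranges from 0 (identical distributions) to 1 (no overlap), so the threshold can be tuned to how much shift is tolerable before retraining.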
Overall, the project is a practical resource for learning modern MLOps, from conceptual understanding through production deployment.