LLaMA‑Factory is an open-source framework for fine-tuning, training, and deploying large language models and vision-language models (LLMs/VLMs). Presented by Yaowei Zheng et al. at ACL 2024, with the paper also available on arXiv ([gitee.com][1]), the project highlights the following features:
- Covers over a hundred models of various sizes and architectures, from LLaMA and Phi to Qwen2-VL, Gemma, and DeepSeek.
- Integrates the common training stages, from pre-training and SFT to reward-model training and PPO/DPO reinforcement learning.
- Real-time viewing of training progress, metrics, and logs through the web UI (LlamaBoard), TensorBoard, Wandb, and more.
- Supports serving fine-tuned models through an OpenAI-compatible API, with concurrent inference via vLLM or a Gradio front end.
pip install llamafactory  # or install from a GitHub clone, as shown below
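For the latest features, the upstream README recommends an editable install from a source checkout; a minimal sketch following that README (the extras shown are the documented defaults):
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"  # extras pull in PyTorch and evaluation metrics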
CLI Mode:
# the model can be any Hugging Face id or local path, e.g.:
llamafactory-cli train \
  --stage sft \
  --do_train \
  --model_name_or_path meta-llama/Llama-2-13b-hf \
  --dataset mydata \
  --template default \
  --finetuning_type lora \
  --output_dir saves/llama-13b-lora
# Refer to the official documentation for more parameters
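Recent releases also accept a single YAML file in place of command-line flags; for example, the LoRA SFT recipe that ships in the repository's examples directory:
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml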
Web UI Mode:
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
This starts LlamaBoard, a one-stop web UI for configuring training hyperparameters.
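In current releases the same interface can be launched through the bundled CLI (assuming the pip-installed entry point):
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webui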
The project ships with 60+ datasets (in the data directory) and also supports custom JSON files; all datasets are registered centrally in data/dataset_info.json, as shown below.
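A custom file is registered by adding one entry to data/dataset_info.json; a minimal sketch, where mydata.json and the alpaca-style column mapping are illustrative:
"mydata": {
  "file_name": "mydata.json",
  "columns": {
    "prompt": "instruction",
    "query": "input",
    "response": "output"
  }
}
The dataset can then be referenced by name, e.g. --dataset mydata in the CLI example above.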
TensorBoard and Wandb dashboards are supported automatically during training; other monitoring backends such as MLflow and SwanLab can also be connected.
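Because training arguments are passed through to Hugging Face's TrainingArguments, the reporting backend can typically be chosen with the standard report_to flag; a hedged sketch reusing the flags from the CLI example above:
llamafactory-cli train \
  --stage sft \
  --do_train \
  --model_name_or_path meta-llama/Llama-2-13b-hf \
  --dataset mydata \
  --template default \
  --finetuning_type lora \
  --output_dir saves/llama-13b-lora \
  --report_to wandb \
  --logging_steps 10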
After training, a deployment package can be generated directly via the CLI or an export script, supporting concurrent inference and a Gradio demo.
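A sketch of that flow with the bundled CLI; the adapter path and export directory are illustrative, and the vLLM backend is optional:
# merge the LoRA adapter into the base weights and export
llamafactory-cli export \
  --model_name_or_path meta-llama/Llama-2-13b-hf \
  --adapter_name_or_path saves/llama-13b-lora \
  --template default \
  --finetuning_type lora \
  --export_dir exports/llama-13b-merged
# serve it behind an OpenAI-compatible API, using vLLM for concurrent inference
API_PORT=8000 llamafactory-cli api \
  --model_name_or_path exports/llama-13b-merged \
  --template default \
  --infer_backend vllm
A Gradio chat front end can be launched the same way with llamafactory-cli webchat.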
LLaMA‑Factory is a feature-rich, easy-to-use, and technically advanced LLM fine-tuning framework. Whether you are a researcher or an engineer, it lets you quickly customize, train, and deploy a wide range of open-source models without writing complex code, making it a powerful entry point into LLM fine-tuning.