LMFlow - Large Foundation Model Fine-tuning and Inference Toolkit
Project Overview
LMFlow, developed by the OptimalScale team, is an open-source, scalable, convenient, and efficient toolkit for fine-tuning large foundation models. Designed for user-friendliness, speed, and reliability, the project aims to make large language model technology accessible to the entire community, realizing the vision of "making large models serve everyone."
Project Address: https://github.com/OptimalScale/LMFlow
Core Features
1. Diverse Training Methods Supported
- Full Parameter Fine-tuning: Updates all parameters to fine-tune the language model.
- LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning algorithm that is more efficient than full parameter fine-tuning.
- LISA (Layerwise Importance Sampling): A memory-efficient fine-tuning algorithm that can train 7B models in 24GB of VRAM without offloading.
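The core idea behind LoRA can be shown in a few lines. The sketch below is a toy illustration of the low-rank adaptation math, not LMFlow's implementation: the frozen weight W is augmented with a product B @ A of two small matrices, so only r * (d_in + d_out) parameters need training instead of d_in * d_out.

```python
# Toy illustration of the LoRA idea (not the LMFlow implementation):
# the frozen weight W is adapted by a low-rank product B @ A, so only
# r * (d_in + d_out) parameters are trained instead of d_in * d_out.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_forward(x, w, b, a, alpha=1, r=1):
    """y = x @ (W + (alpha / r) * B @ A), computed as two separate paths."""
    base = matmul(x, w)                  # frozen path: x @ W
    delta = matmul(matmul(x, b), a)      # low-rank adapter path: (x @ B) @ A
    scale = alpha / r
    return [[base[i][j] + scale * delta[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]

x = [[1.0, 2.0]]                 # one input row, d_in = 2
w = [[1.0, 0.0], [0.0, 1.0]]     # frozen d_in x d_out weight
a = [[0.5, 0.5]]                 # trainable r x d_out factor, r = 1
b_zero = [[0.0], [0.0]]          # B starts at zero: the adapter is a no-op
b_trained = [[1.0], [1.0]]       # B after some hypothetical training
```

Because B is initialized to zero, the adapted model reproduces the frozen base model exactly until training begins, which is what makes LoRA safe to bolt onto a pretrained checkpoint.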
2. Broad Model Support
Supports a variety of mainstream large language models, including:
- DeepSeek Series: deepseek, deepseek_v2, deepseek_r1, etc.
- LLaMA Series: llama2, llama3, llama3_for_tool
- Qwen Series: qwen2, qwen2_for_tool, qwen2_5, etc.
- Gemma, Phi, Yi, InternLM2 and other model architectures.
3. Performance Optimization Techniques
Memory Optimization
- FlashAttention-2: Supports the latest FlashAttention technology, significantly improving training and inference speed.
- Gradient Checkpointing: Optimizes memory usage through a strategy of trading computation for memory.
- DeepSpeed Zero-3: Supports distributed training of large-scale models.
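The compute-for-memory trade behind gradient checkpointing can be sketched without any deep learning framework. The toy below (an illustration, not the library's mechanism) caches only every k-th layer input and recomputes the rest from the nearest checkpoint when they are needed again, shrinking the activation cache roughly k-fold:

```python
# Toy illustration of gradient checkpointing: cache only every k-th
# activation during the forward pass and recompute the ones in between
# from the nearest checkpoint, trading extra compute for less memory.

def forward_full(x, layers):
    """Plain forward pass: cache the input of every layer."""
    cache = []
    for f in layers:
        cache.append(x)
        x = f(x)
    return x, cache

def forward_checkpointed(x, layers, k=2):
    """Forward pass that caches only every k-th layer input."""
    checkpoints = {}
    for i, f in enumerate(layers):
        if i % k == 0:
            checkpoints[i] = x
        x = f(x)
    return x, checkpoints

def recompute_input(i, layers, checkpoints):
    """Rebuild layer i's input from the nearest earlier checkpoint."""
    j = max(c for c in checkpoints if c <= i)
    x = checkpoints[j]
    for f in layers[j:i]:
        x = f(x)
    return x

layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]
out_full, cache = forward_full(5, layers)
out_ckpt, checkpoints = forward_checkpointed(5, layers, k=2)
```

Both passes produce the same output, but the checkpointed one stores half as many activations; the backward pass pays for that by re-running short forward segments.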
Inference Acceleration
- vLLM Integration: Supports fast and easy-to-use LLM inference and serving.
- Speculative Decoding: Supports speculative decoding technology to accelerate inference.
- CPU Inference: Supports running LLaMA models on CPUs (through 4-bit quantization).
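The idea that makes CPU inference feasible is weight quantization. The sketch below shows generic absmax quantization to a 4-bit integer range, which is an illustration of the concept rather than the exact storage format the toolkit uses: each block of weights is reduced to 16 integer levels plus one shared scale.

```python
# Generic sketch of 4-bit absmax quantization (not the exact on-disk
# format used for CPU inference): weights are mapped to integers in
# [-7, 7] plus one shared scale, cutting memory roughly 4x versus fp16.

def quantize_4bit(weights):
    """Map floats to integers in [-7, 7] with a shared absmax scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

weights = [3.2, -7.0, 1.5, 0.0]
q, scale = quantize_4bit(weights)
restored = dequantize_4bit(q, scale)
```

The round trip loses at most half a quantization step per weight, which is the accuracy cost paid for the memory savings.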
4. Rich Feature Set
Dialogue Template Support
- Ships with presets for the latest Llama-3 and Phi-3 dialogue templates.
- Supports various common templates such as chatml.
- Customizable dialogue templates for better performance.
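As an example of what a dialogue template does, the sketch below renders a conversation in the chatml format mentioned above. The exact special tokens can differ between models, so treat this as a minimal illustration rather than the toolkit's renderer:

```python
# Minimal sketch of the chatml conversation template (special-token
# details vary between models; a real toolkit applies the template that
# matches the tokenizer's training data).

def format_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a chatml prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")   # cue the model to respond
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is LoRA?"},
])
```

Getting the template exactly right matters: a model fine-tuned on one delimiter convention degrades noticeably when prompted with another, which is why customizable templates are exposed.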
Multi-modal Support
- Supports multi-modal input of images and text.
- Provides multi-modal chatbot functionality.
- Online demo service available.
Long Context Handling
- Supports position interpolation (linear and NTK scaling) for LLaMA models.
- Extends the model's context handling capabilities.
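Linear position interpolation is simple enough to show directly. The sketch below illustrates the core trick (compressing position indices so they stay inside the range the model saw during training); NTK scaling instead adjusts the rotary embedding base, which is not shown here:

```python
# Sketch of linear position interpolation: to run a model trained on
# context length L at length s * L, position indices are divided by the
# factor s so they stay inside the range seen during training.

def interpolate_positions(seq_len, trained_len):
    """Rescale positions 0..seq_len-1 into the trained range [0, trained_len)."""
    scale = max(seq_len / trained_len, 1.0)   # only shrink, never stretch
    return [i / scale for i in range(seq_len)]

# Extending a 4-token training context to an 8-token window halves
# every position index.
positions = interpolate_positions(8, 4)
```

Sequences shorter than the trained length are left untouched, since the guard keeps the scale factor at 1.0.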
5. Evaluation and Benchmarking
LMFlow Benchmark is an automated evaluation framework designed for open-source large language models. It uses negative log-likelihood (NLL) as its metric to evaluate a model's capabilities in the following areas:
- Chitchat conversation
- Commonsense reasoning
- Instruction following
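The metric itself is straightforward: average the negative log of the probability the model assigned to each reference token, so lower is better. A minimal sketch of the computation:

```python
import math

# Sketch of the negative log-likelihood (NLL) metric: average the
# negative log of the probability the model assigned to each reference
# token. Lower values mean the model found the reference text more likely.

def negative_log_likelihood(token_probs):
    """Mean NLL over a sequence of per-token model probabilities."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# A model that is more confident about the reference answer scores lower.
confident = negative_log_likelihood([0.9, 0.8, 0.95])
uncertain = negative_log_likelihood([0.2, 0.3, 0.1])
```

Because NLL needs only the model's own token probabilities on reference text, it avoids relying on another LLM as a judge, which is what makes the benchmark automated.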
Technical Innovation
RAFT Algorithm
The project proposes a new alignment algorithm: Reward rAnked FineTuning (RAFT), which is more efficient than traditional PPO-based RLHF.
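The core RAFT loop can be sketched in a few lines. This is a simplified illustration of the reward-ranked selection step, not the paper's full algorithm: sample several candidate responses per prompt, score them with a reward model, and keep only the best-ranked ones as supervised fine-tuning data.

```python
# Simplified sketch of the RAFT selection step: generate k candidate
# responses per prompt, rank them with a reward model, and keep the top
# ones as supervised fine-tuning data for the next training round.

def raft_select(prompts, generate, reward, k=4, keep=1):
    """Build a fine-tuning batch from the top-`keep` of k samples per prompt."""
    batch = []
    for p in prompts:
        candidates = [generate(p) for _ in range(k)]
        ranked = sorted(candidates, key=reward, reverse=True)
        batch.extend((p, c) for c in ranked[:keep])
    return batch

# Toy stand-ins for the policy and reward model (assumptions made only
# for this sketch): the "reward" simply prefers longer responses.
responses = iter(["ok", "a detailed answer", "no", "hmm"])
batch = raft_select(["q1"], lambda p: next(responses), reward=len, k=4)
```

Because the selected pairs are trained with ordinary supervised fine-tuning, RAFT sidesteps the instability and extra machinery (value networks, KL controllers) of PPO-based RLHF.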
Custom Optimizers
Supports training with a variety of custom optimizers, including:
- RMSprop, LION-32bit, Adam, AdamW
- AdaFactor, Adan, RAdam and over 20 other optimizers
The most suitable optimization strategy can be selected based on the specific task.
Real-world Application Cases
Breakthroughs in the Medical Field
Models trained with LMFlow perform exceptionally well in the medical field: task-tuned models outperform ChatGPT on medical tasks, demonstrating significant potential in vertical domains.
Robin Model Series
The project has released several high-performance Robin models:
- Robin-33B-V2: Achieved an excellent score of 64.1 on the Huggingface LLM leaderboard.
- Provides checkpoints in multiple sizes: 7B, 13B, 33B, 65B, etc.
Installation and Usage
Environment Requirements
- Primarily tested on Linux OS (Ubuntu 20.04).
- Supports CUDA versions 10.3-11.7.
- Python 3.9 environment.
Quick Installation
git clone -b v0.0.9 https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .
PyPI Installation
pip install lmflow-finetune
Technical Impact
The LMFlow project has had a significant impact in both academia and industry:
- Related papers published in top academic conferences.
- Gained significant attention and usage on GitHub.
- Made significant contributions to the open-source large language model ecosystem.
Summary
LMFlow, as a comprehensive large language model toolkit, not only provides complete model training and inference solutions but also innovates in memory optimization, performance acceleration, model evaluation, and other aspects. It lowers the barrier to entry for using large language models, allowing more researchers and developers to easily build and deploy their own language models, truly realizing the goal of "making large models serve everyone."