🤗 PEFT: A Detailed Introduction to the Parameter-Efficient Fine-Tuning Library
Project Overview
PEFT (Parameter-Efficient Fine-Tuning) is a parameter-efficient fine-tuning library developed by Hugging Face. It addresses the high computational cost and large storage footprint of fine-tuning large pre-trained models by training only a small number of additional parameters while keeping the base model frozen.
GitHub Address: https://github.com/huggingface/peft
Core Values and Advantages
1. Cost-Effectiveness
- Significantly Reduced Computational Costs: Compared to traditional full-parameter fine-tuning, PEFT methods train only a small fraction of the model's parameters.
- Significantly Reduced Storage Requirements: Fine-tuned checkpoints contain only the adapter weights and are typically a few MB instead of many GB.
- Optimized Memory Usage: Larger models can be fine-tuned on the same hardware.
2. Performance Retention
- Comparable to Full-Parameter Fine-Tuning: Achieves performance comparable to full fine-tuning on most tasks.
- Avoids Catastrophic Forgetting: Because the base model's weights remain frozen, its original knowledge is preserved and the risk of overfitting is reduced.
3. Flexibility and Convenience
- Multi-Task Adaptation: Multiple lightweight adapters can be trained for different tasks and swapped on a single base model (see the sketch after this list).
- Seamless Integration: Perfectly integrated with ecosystems such as Transformers, Diffusers, and Accelerate.
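To illustrate the multi-task point, here is a minimal sketch of attaching several adapters to one base model and switching between them. The adapter paths and names are hypothetical placeholders; PeftModel.from_pretrained, load_adapter, and set_adapter are part of PEFT's API.

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model once
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Attach a first task-specific adapter (hypothetical local path)
model = PeftModel.from_pretrained(base_model, "./sst2-lora", adapter_name="sst2")

# Attach a second adapter to the same base model (also a placeholder path)
model.load_adapter("./summarization-lora", adapter_name="summarization")

# Switch the active adapter without reloading the multi-GB base weights
model.set_adapter("summarization")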
Supported Fine-Tuning Methods
Main PEFT Techniques
LoRA (Low-Rank Adaptation)
- The most popular parameter-efficient fine-tuning method.
- Keeps the pretrained weight matrix frozen and learns a low-rank update, so only two small factor matrices are trained (sketched below).
- Typically only requires training 0.1%-1% of the original parameters.
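The core idea in plain PyTorch, as a minimal self-contained sketch (an illustration of the math, not PEFT's actual implementation): the frozen weight W is augmented with a trainable low-rank update BA scaled by alpha/r.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, in_features, out_features, r=8, alpha=32):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained weight stays frozen
        self.lora_A = nn.Linear(in_features, r, bias=False)   # down-projection to rank r
        self.lora_B = nn.Linear(r, out_features, bias=False)  # up-projection back
        nn.init.zeros_(self.lora_B.weight)  # update starts at zero, so training begins at the base model
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

With in_features = out_features = 1024 and r = 8, the update adds 2 × 1024 × 8 = 16,384 trainable parameters against roughly one million frozen ones, about 1.6%; at full model scale this is how the 0.1%-1% figures arise.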
AdaLoRA
- An improved version of LoRA.
- Adaptively allocates the rank budget across weight matrices during training, pruning less important directions for further efficiency (see the configuration sketch below).
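A configuration sketch using PEFT's AdaLoraConfig. The field names follow the PEFT docs, but required arguments and defaults vary between library versions, so treat the values as illustrative.

from peft import AdaLoraConfig, TaskType

adalora_config = AdaLoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    init_r=12,        # starting rank before adaptive pruning
    target_r=8,       # average rank budget after adaptation
    total_step=1000,  # total training steps, used to schedule the rank allocation
    lora_alpha=32,
)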
Prefix Tuning
- Prepends trainable continuous prefix vectors to the keys and values of every attention layer, rather than to the raw input text (configuration example below).
- Suitable for generation tasks.
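A minimal configuration sketch with PEFT's PrefixTuningConfig; num_virtual_tokens controls the prefix length, and the value here is illustrative.

from peft import PrefixTuningConfig, TaskType

prefix_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,  # length of the trainable prefix
)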
P-Tuning v2
- An improved prompt tuning method.
- Inserts learnable prompt parameters at multiple layers rather than only at the input embedding (see the sketch below).
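In PEFT, the closely related P-tuning method is configured via PromptEncoderConfig, which trains virtual prompt tokens through a small encoder network; this sketch follows the PEFT docs, with illustrative values.

from peft import PromptEncoderConfig, TaskType

ptuning_config = PromptEncoderConfig(
    task_type=TaskType.SEQ_CLS,
    num_virtual_tokens=20,    # number of trainable virtual prompt tokens
    encoder_hidden_size=128,  # hidden size of the prompt encoder
)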
IA³ (Infused Adapter by Inhibiting and Amplifying Inner Activations)
- Adapts the model by learning small vectors that rescale (inhibit or amplify) internal activations, typically the keys, values, and feed-forward activations, training even fewer parameters than LoRA (sketch below).
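A minimal IA3Config sketch; for architectures PEFT does not recognize out of the box, target_modules and feedforward_modules would need to be set explicitly.

from peft import IA3Config, TaskType

ia3_config = IA3Config(
    task_type=TaskType.SEQ_2_SEQ_LM,
    # target_modules / feedforward_modules can be set for custom architectures
)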
Practical Application Effects
Memory Usage Comparison (A100 80GB GPU)
| Model | Full-Parameter Fine-Tuning | PEFT-LoRA | PEFT-LoRA + DeepSpeed CPU Offload |
| --- | --- | --- | --- |
| T0_3B (3 billion params) | 47.14GB GPU / 2.96GB CPU | 14.4GB GPU / 2.96GB CPU | 9.8GB GPU / 17.8GB CPU |
| mt0-xxl (12 billion params) | Out of Memory | 56GB GPU / 3GB CPU | 22GB GPU / 52GB CPU |
| bloomz-7b1 (7 billion params) | Out of Memory | 32GB GPU / 3.8GB CPU | 18.1GB GPU / 35GB CPU |
Performance
Accuracy comparison on the Twitter Complaints classification task (from the RAFT benchmark):
- Human Baseline: 89.7%
- Flan-T5: 89.2%
- LoRA-T0-3B: 86.3%
Installation and Quick Start
Installation
pip install peft
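To get the latest unreleased changes, the README also documents installing from source:

pip install git+https://github.com/huggingface/peft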
Basic Usage Example
from transformers import AutoModelForSeq2SeqLM
from peft import get_peft_model, LoraConfig, TaskType
# Configure PEFT
model_name_or_path = "bigscience/mt0-large"
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
# Load and wrap the model
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
model = get_peft_model(model, peft_config)
# View trainable parameters
model.print_trainable_parameters()
# Output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19
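After training, only the adapter weights need to be saved, which is what keeps checkpoints in the MB range; a short sketch (the directory name is a placeholder):

# Save only the adapter weights, not the multi-GB base model
model.save_pretrained("mt0-large-lora")
# The directory holds adapter_config.json plus the adapter weights and can be
# reloaded onto the base model with PeftModel.from_pretrained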
Inference Usage
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
# Load the fine-tuned model
model = AutoPeftModelForCausalLM.from_pretrained("ybelkada/opt-350m-lora").to("cuda")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
# Perform inference
model.eval()
inputs = tokenizer("Your input text", return_tensors="pt")
outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"), max_new_tokens=50)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
Ecosystem Integration
1. Transformers Integration
- Supports various pre-trained model architectures.
- Seamless training and inference workflows.
- Automatic model configuration and optimization.
2. Diffusers Integration
- Supports efficient fine-tuning of diffusion models (see the sketch after this list).
- Suitable for image generation, editing, and other tasks.
- Significantly reduces training memory requirements.
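As a sketch of the integration (the LoRA checkpoint path is a placeholder; load_lora_weights is the Diffusers API for loading PEFT-format LoRA weights):

import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load a LoRA adapter trained with PEFT (placeholder path)
pipe.load_lora_weights("path/to/your-lora")
image = pipe("a watercolor painting of a lighthouse").images[0]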
3. Accelerate Integration
- Supports distributed training.
- Multi-GPU, TPU training optimization.
- Consumer-grade hardware friendly.
4. TRL (Transformer Reinforcement Learning) Integration
- Supports RLHF (Reinforcement Learning from Human Feedback).
- DPO (Direct Preference Optimization).
- Large model alignment training (a trainer sketch follows).
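As a sketch of how PEFT plugs into TRL, a LoraConfig can be passed straight to a TRL trainer, which applies it internally; the dataset is a placeholder and the exact SFTTrainer arguments vary across trl versions.

from datasets import load_dataset
from trl import SFTTrainer
from peft import LoraConfig

dataset = load_dataset("imdb", split="train")  # placeholder dataset

trainer = SFTTrainer(
    "facebook/opt-350m",
    train_dataset=dataset,
    dataset_text_field="text",  # in newer trl versions this is set via SFTConfig
    peft_config=LoraConfig(r=8, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()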
Application Scenarios
1. Large Language Model Fine-Tuning
- Instruction fine-tuning.
- Dialogue system optimization.
- Specific domain adaptation.
2. Multimodal Models
- Visual-language model fine-tuning.
- Audio processing model adaptation.
3. Diffusion Models
- Stable Diffusion personalization.
- DreamBooth training.
- Style transfer.
4. Reinforcement Learning
- Policy model fine-tuning.
- Reward model training.
- Human preference alignment.
Technical Advantages and Innovations
1. Parameter Efficiency
- Only train 0.1%-1% of the original parameters.
- Maintains performance close to full fine-tuning on most tasks.
- Checkpoint files shrink from GB to MB, often by two or more orders of magnitude.
2. Memory Optimization
- Significantly reduces GPU memory requirements.
- Supports training large models on consumer-grade hardware.
- Gradient checkpointing further optimizes memory.
3. Quantization Compatibility
- Works with 8-bit and 4-bit quantized base models (via bitsandbytes).
- Supports the QLoRA approach of training LoRA adapters on a 4-bit quantized base model (see the sketch after this list).
- Further reduces the hardware threshold.
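A sketch of a QLoRA-style setup that combines 4-bit quantization, gradient checkpointing, and LoRA; the model name is a placeholder and the bitsandbytes options reflect common usage rather than a single required recipe.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model with 4-bit quantized weights
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m", quantization_config=bnb_config)

# Enables gradient checkpointing and prepares the quantized layers for training
model = prepare_model_for_kbit_training(model)

model = get_peft_model(model, LoraConfig(r=8, lora_alpha=32, task_type="CAUSAL_LM"))
model.print_trainable_parameters()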
4. Modular Design
- Supports multiple PEFT methods.
- Flexible configuration options.
- Easy to extend new methods.
Community and Ecosystem
Official Resources
- GitHub repository: https://github.com/huggingface/peft
- Documentation: https://huggingface.co/docs/peft
Summary
🤗 PEFT is a parameter-efficient fine-tuning library that tackles the cost of fine-tuning large models while preserving strong performance. For researchers and industrial developers alike, it offers a cost-effective way to customize large models, helping to democratize AI technology.