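huggingface-gemma-recipes is an open-source project officially maintained by Hugging Face that provides minimal example code and tutorials for Google's Gemma family of models. Its main goal is to help developers get started quickly with Gemma inference, fine-tuning, and practical applications.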
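The project covers the multimodal capabilities of the Gemma 3 family of models; the inference examples in this section use text, audio, and image inputs.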
The project provides a unified inference interface for loading and running Gemma models quickly:
from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

# Pick a device; this line is an addition, since the snippet used `device` without defining it.
device = "cuda" if torch.cuda.is_available() else "cpu"

model_id = "google/gemma-3n-e4b-it"  # or google/gemma-3n-e2b-it
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id).to(device)


def model_generation(model, messages):
    # Apply the chat template and tokenize the (possibly multimodal) messages.
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    )
    input_len = inputs["input_ids"].shape[-1]
    inputs = inputs.to(model.device, dtype=model.dtype)

    with torch.inference_mode():
        generation = model.generate(**inputs, max_new_tokens=32, disable_compile=False)
        # Keep only the newly generated tokens, dropping the prompt.
        generation = generation[:, input_len:]

    decoded = processor.batch_decode(generation, skip_special_tokens=True)
    print(decoded[0])

# Text question answering
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the capital of France?"}
        ]
    }
]
model_generation(model, messages)

# Speech to text
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe the following speech segment in English:"},
            {"type": "audio", "audio": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/speech.wav"},
        ]
    }
]
model_generation(model, messages)

# Image description
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"},
            {"type": "text", "text": "Describe this image."}
        ]
    }
]
model_generation(model, messages)
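
The three examples above share one message shape: a single user turn whose content is an ordered list of typed parts, where each part's payload key matches its type ("text", "image", or "audio"). A minimal sketch of a helper that builds this structure follows. The name `user_message` and its tuple-based signature are hypothetical and not part of the project; the helper only assembles the dictionary and does not load a model.

```python
ALLOWED_KINDS = {"text", "image", "audio"}


def user_message(*parts):
    """Build a single-turn message list from (kind, value) pairs.

    Hypothetical helper for illustration. Parts are kept in the order
    given, so the caller decides whether the text comes before or after
    the media entry, as in the examples above.
    """
    content = []
    for kind, value in parts:
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unsupported content type: {kind!r}")
        # The payload key has the same name as the type, e.g.
        # {"type": "image", "image": <url>}.
        content.append({"type": kind, kind: value})
    return [{"role": "user", "content": content}]


# Rebuild the image-description request from the example above.
messages = user_message(
    ("image", "https://huggingface.co/datasets/ariG23498/demo-data/resolve/main/airplane.jpg"),
    ("text", "Describe this image."),
)
```

The resulting `messages` list can be passed to `model_generation(model, messages)` in the same way as the hand-written dictionaries.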
The project provides several fine-tuning approaches and scripts:
# Install the dependencies
$ pip install -U -q -r requirements.txt
# Install the core dependencies
$ pip install -U -q transformers timm
# Install the full set of dependencies (for fine-tuning)
$ pip install -U -q -r requirements.txt
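
After installation, it can help to confirm that the core packages are actually visible to the current Python environment. The following is a minimal sketch using only the standard library; the function name `installed_versions` is hypothetical, and the package list simply mirrors the `pip install` command above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Core packages named in the install command above.
    for name, version in installed_versions(["transformers", "timm"]).items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the package was not found in the active environment, which usually points to installing into a different interpreter or virtual environment.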
huggingface-gemma-recipes/
├── notebooks/          # Jupyter notebook tutorials
│   └── fine_tune_gemma3n_on_t4.ipynb
├── scripts/            # Fine-tuning scripts
│   ├── ft_gemma3n_image_vt.py
│   ├── ft_gemma3n_audio_vt.py
│   └── ft_gemma3n_image_trl.py
├── requirements.txt    # Dependency list
└── README.md           # Project description
As an open-source project officially maintained by Hugging Face, it has the following advantages:
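huggingface-gemma-recipes is a well-maintained open-source project that covers the main steps of working with Gemma models, from inference to fine-tuning. Both beginners and experienced developers can find suitable resources and guidance in it. Its multimodal examples and flexible fine-tuning scripts make it a useful resource for building applications on Gemma.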