huggingface/diffusers

A state-of-the-art diffusion model library that supports image, video, and audio generation

Apache-2.0 | Python | diffusers | huggingface | 30.9k | Last Updated: September 25, 2025

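🤗 Diffusers Project Details

Project Overview

🤗 Diffusers is a state-of-the-art diffusion model library developed by Hugging Face for generating images, audio, and even 3D structures of molecules. Whether you are looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both.

Project address: https://github.com/huggingface/diffusers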

Core Features

Design Philosophy

Usability over performance
Simple over easy
Customizability over abstractions

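Three Core Components

State-of-the-art diffusion pipelines (Diffusion Pipelines)
- Run inference with just a few lines of code
- Support a wide range of generation tasks
Interchangeable noise schedulers (Noise Schedulers)
- Support different diffusion speeds
- Let you trade off output quality
Pretrained models (Pretrained Models)
- Can be used as building blocks
- Combine with schedulers to create end-to-end diffusion systems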
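Because schedulers are interchangeable, a pipeline's scheduler can usually be replaced without reloading the model weights. The sketch below assumes the common Diffusers pattern of building the new scheduler from the current scheduler's config via `from_config`. The helper name `swap_scheduler` is hypothetical and not part of the library.

```python
def swap_scheduler(pipeline, scheduler_cls):
    """Replace pipeline.scheduler with scheduler_cls, reusing the current config.

    `swap_scheduler` is an illustrative helper, not a Diffusers API.
    """
    # Scheduler classes expose a `from_config` constructor; passing the old
    # scheduler's config carries over settings such as the beta schedule.
    pipeline.scheduler = scheduler_cls.from_config(pipeline.scheduler.config)
    return pipeline
```

For example, after `from diffusers import EulerDiscreteScheduler`, calling `swap_scheduler(pipeline, EulerDiscreteScheduler)` on a Stable Diffusion pipeline would switch the sampler. Whether this improves speed or quality depends on the model and the number of steps.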

Installation

PyTorch version

# Official package (quotes keep shells such as zsh from expanding the brackets)
pip install --upgrade "diffusers[torch]"

# Community-maintained conda version
conda install -c conda-forge diffusers

Flax version

pip install --upgrade "diffusers[flax]"
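After installing, it can be useful to confirm which version (if any) the current Python environment sees. The following is a minimal sketch that uses only the standard library. The function name `installed_version` is an illustrative choice.

```python
from importlib import metadata, util


def installed_version(package="diffusers"):
    """Return the installed version string of `package`, or None if it is absent."""
    # find_spec returns None when the top-level module cannot be imported.
    if util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable module without distribution metadata,
        # for example a source checkout placed on sys.path.
        return None


print("diffusers:", installed_version("diffusers"))
```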

Quick Start

Text-to-image generation

from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")
pipeline("An image of a squirrel in Picasso style").images[0]
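The snippet above assumes a CUDA GPU and discards the result unless it runs in a notebook. As a hedged variation, the sketch below picks a device at runtime, fixes the random seed through the pipeline's `generator` argument, and saves the image to disk. The helper names `pick_device` and `generate`, the default seed, and the output path are illustrative assumptions. Calling `generate` downloads the model weights.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may not have the mps backend attribute at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def generate(prompt,
             model_id="stable-diffusion-v1-5/stable-diffusion-v1-5",
             seed=0,
             out_path="output.png"):
    """Generate one image for `prompt` and save it to `out_path` (illustrative helper)."""
    import torch
    from diffusers import DiffusionPipeline

    device = pick_device()
    # Half precision is a reasonable default on CUDA; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipeline.to(device)

    # A seeded generator makes repeated runs start from the same noise.
    generator = torch.Generator("cpu").manual_seed(seed)
    image = pipeline(prompt, generator=generator).images[0]
    image.save(out_path)  # `images[0]` is a PIL image
    return image
```

A call such as `generate("An image of a squirrel in Picasso style")` would then write `output.png` to the working directory.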

Custom diffusion system

from diffusers import DDPMScheduler, UNet2DModel
from PIL import Image
import torch

scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
model = UNet2DModel.from_pretrained("google/ddpm-cat-256").to("cuda")
scheduler.set_timesteps(50)

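# Start from pure Gaussian noise shaped like the model's expected input
sample_size = model.config.sample_size
noise = torch.randn((1, 3, sample_size, sample_size), device="cuda")
sample = noise

for t in scheduler.timesteps:
    # Predict the noise residual at timestep t (no gradients needed for inference)
    with torch.no_grad():
        noisy_residual = model(sample, t).sample
    # Let the scheduler compute the less noisy sample for the previous timestep
    sample = scheduler.step(noisy_residual, t, sample).prev_sample

# Map from [-1, 1] to [0, 1], move channels last, and convert to an 8-bit PIL image
image = (sample / 2 + 0.5).clamp(0, 1)
image = image.cpu().permute(0, 2, 3, 1).numpy()[0]
image = Image.fromarray((image * 255).round().astype("uint8"))
image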
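To make the role of `scheduler.step` in the loop above more concrete, here is a NumPy sketch of one textbook DDPM reverse step. It is an illustration under stated assumptions, not the `DDPMScheduler` implementation: the linear beta schedule values, the variance choice sigma_t^2 = beta_t, and all function names are assumptions made for this example.

```python
import numpy as np


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed values for illustration)."""
    return np.linspace(beta_start, beta_end, num_steps)


def ddpm_reverse_step(x_t, eps_pred, t, betas, rng):
    """One textbook DDPM step from x_t to x_{t-1}, given the predicted noise."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)  # cumulative product of alphas up to each step
    # Posterior mean: remove the scaled predicted noise, then rescale.
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added at the final step
    sigma = np.sqrt(betas[t])  # assumed variance choice: sigma_t^2 = beta_t
    return mean + sigma * rng.standard_normal(x_t.shape)
```

In a full sampler, `eps_pred` would come from the model (the `noisy_residual` in the loop above), and the step would be applied for each timestep from the largest down to 0.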

Main Supported Tasks and Models

Task	Pipeline	Recommended model
Unconditional image generation	DDPMPipeline	google/ddpm-ema-church-256
Text-to-image	StableDiffusionPipeline	stable-diffusion-v1-5/stable-diffusion-v1-5
Text-to-image (unCLIP)	UnCLIPPipeline	kakaobrain/karlo-v1-alpha
Text-to-image (DeepFloyd IF)	IFPipeline	DeepFloyd/IF-I-XL-v1.0
Text-to-image (Kandinsky)	KandinskyPipeline	kandinsky-community/kandinsky-2-2-decoder
Controllable generation	StableDiffusionControlNetPipeline	lllyasviel/sd-controlnet-canny
Image editing	StableDiffusionInstructPix2PixPipeline	timbrooks/instruct-pix2pix
Image-to-image	StableDiffusionImg2ImgPipeline	stable-diffusion-v1-5/stable-diffusion-v1-5
Image inpainting	StableDiffusionInpaintPipeline	runwayml/stable-diffusion-inpainting
Image variation	StableDiffusionImageVariationPipeline	lambdalabs/sd-image-variations-diffusers
Image super-resolution	StableDiffusionUpscalePipeline	stabilityai/stable-diffusion-x4-upscaler
Latent super-resolution	StableDiffusionLatentUpscalePipeline	stabilityai/sd-x2-latent-upscaler
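The table above can also be used programmatically. The sketch below copies a few of its rows into a dictionary and looks up the pipeline class by name at load time. The task keys, the `TASK_TABLE` name, and the `load_pipeline` helper are hypothetical. Some checkpoints may require accepting a license on the Hugging Face Hub before they can be downloaded.

```python
# (pipeline class name, recommended checkpoint), copied from the table above.
# The dictionary keys are illustrative labels, not official task identifiers.
TASK_TABLE = {
    "unconditional-image-generation": ("DDPMPipeline", "google/ddpm-ema-church-256"),
    "text-to-image": ("StableDiffusionPipeline", "stable-diffusion-v1-5/stable-diffusion-v1-5"),
    "image-to-image": ("StableDiffusionImg2ImgPipeline", "stable-diffusion-v1-5/stable-diffusion-v1-5"),
    "inpainting": ("StableDiffusionInpaintPipeline", "runwayml/stable-diffusion-inpainting"),
    "super-resolution": ("StableDiffusionUpscalePipeline", "stabilityai/stable-diffusion-x4-upscaler"),
}


def load_pipeline(task, **kwargs):
    """Load the recommended pipeline for `task` (downloads weights on first use)."""
    import diffusers  # imported lazily so the table can be inspected without it

    class_name, model_id = TASK_TABLE[task]
    pipeline_cls = getattr(diffusers, class_name)
    # Extra keyword arguments (for example torch_dtype) are passed through unchanged.
    return pipeline_cls.from_pretrained(model_id, **kwargs)
```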

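Documentation Structure

Documentation type	What you will learn
Tutorial	The library's basic skills, such as building a diffusion system from models and schedulers and training your own diffusion model
Loading	How to load and configure all components of the library (pipelines, models, and schedulers), and how to use different schedulers
Pipelines for inference	How to use pipelines for different inference tasks, batched generation, and controlling generated outputs and randomness
Optimization	How to optimize pipelines to run on memory-limited hardware and to speed up inference
Training	How to train your own diffusion model for different tasks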
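As a small illustration of the Optimization topic, the sketch below enables two memory-saving options that Stable Diffusion pipelines commonly expose: `enable_attention_slicing()` and `enable_model_cpu_offload()`. The helper name `apply_memory_savers` is hypothetical, and which options are available or beneficial depends on the pipeline and the installed version. CPU offload also requires the `accelerate` package.

```python
def apply_memory_savers(pipeline, offload=True):
    """Turn on common memory-saving options (illustrative helper, not a Diffusers API)."""
    # Compute attention in slices to lower peak memory, usually at some speed cost.
    pipeline.enable_attention_slicing()
    if offload:
        # Keep sub-models on the CPU and move each to the GPU only while it runs.
        pipeline.enable_model_cpu_offload()
    return pipeline
```

When CPU offload is used, the pipeline is typically not moved to the GPU manually with `pipeline.to("cuda")` beforehand, since the offload hooks manage device placement.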

Community Ecosystem

Integrated projects

Microsoft TaskMatrix
InvokeAI
InstantID
Apple ML Stable Diffusion
Lama Cleaner
Grounded Segment Anything
Stable DreamFusion
DeepFloyd IF
BentoML
Kohya_ss

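Summary

🤗 Diffusers is one of the most complete and easy-to-use diffusion model libraries currently available. It provides a rich set of pretrained models and pipelines, and it also supports custom training and optimization. Whether you are an AI researcher, a developer, or a creator, you can find the tools you need in this library to build a wide range of generative AI applications.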